POLYPEPTIDES THAT BIND HIV gp120 AND RELATED NUCLEIC
ACIDS, ANTIBODIES, COMPOSITIONS, AND METHODS OF USE

TECHNICAL FIELD OF THE INVENTION 
The present invention relates to polypeptides with
homology to regions of domains of the human chemokine
receptors CCR5, CXCR4, and STRL33, as well as domains of
CD4 that bind with human immunodeficiency virus (HIV), in
particular HIV-1 glycoprotein 120 (gp120) envelope
protein. The present invention also relates to nucleic
acids encoding such polypeptides, antibodies,
compositions comprising such polypeptides, nucleic acids
or antibodies, and methods of using the same.

BACKGROUND OF THE INVENTION 
There are seven-transmembrane chemokine receptors
that act as cofactors for HIV infection. The cofactors
enable entry of HIV-1 into CD4+ T cells and macrophages
(Premack et al., Nature Medicine 2: 1174-78 (1996); and
Zhang et al., Nature 383: 768 (1996)).

The presence of chemokines has an inhibitory effect 
on HIV-1 attachment to, and infection of, susceptible 
cells. Additionally, some mutations in chemokine 
receptors have been shown to result in resistance to 
HIV-1 infection. For example, a 32-nucleotide deletion
within the CCR5 gene has been described in subjects who
remained uninfected despite repeated exposures to HIV-1 
(Huang et al., Nature Medicine 2: 1240-43 (1996)). 

Evidence also exists for the physical association of
a ternary complex between chemokine receptors, CD4, and
HIV-1 gp120 envelope glycoprotein on cell membranes
(Lapham et al., Science 274: 602-05 (1996)). Receptor
signaling and cell activation are probably not required
for the anti-HIV-1 effect of chemokines, since a RANTES
analog lacking the first eight amino-terminal amino
acids, RANTES (9-68), lacked chemotactic and
leukocyte-activating properties, but bound to multiple
chemokine receptors and inhibited infection by
macrophage-tropic HIV-1 (Arenzana-Seisdedos et al.,
Nature 383: 400 (1996)). Cumulatively, the
above-described results suggest that the interaction
between gp120, CD4, and at least one chemokine receptor
is obligatory for HIV-1 infection. Accordingly, reagents
that interfere with the binding of gp120 to chemokine
receptors and to CD4 are used in the biological and
medical arts. However, there presently exists a need for
additional reagents that can compete with one or more
proteins of the gp120-CD4-chemokine receptor complex to
assist in basic biological or viral research, and to
assist in medical intervention in the HIV-1 pandemic. It
is an object of the present invention to provide such
reagents. This and other objects and advantages,
including additional inventive features, will be apparent
from the description provided herein.

BRIEF SUMMARY OF THE INVENTION

The present invention provides a polypeptide that
binds with HIV gp120 under physiological conditions.
Multiple embodiments of the present inventive polypeptide
are provided, and each embodiment possesses a degree of
homology to at least one of the human CCR5, CXCR4 and
STRL33 chemokine receptors, and the human CD4
cell-surface protein.

In a first embodiment, the present invention 
provides a polypeptide comprising the amino acid sequence 
YDIXYYXXE, wherein X is any synthetic or naturally 
occurring amino acid residue, and the polypeptide 
comprises less than about 100 contiguous amino acids that 
are identical to, or, in the alternative, substantially 
identical to, the amino acid sequence of the human CCR5 
chemokine receptor. A preferred polypeptide of this 
first embodiment comprises the amino acid sequence 
YDIN*YYT*S*E. A more preferred polypeptide of this first
embodiment comprises the amino acid sequence YDINYYTSE,
wherein each letter is the standard one-letter
abbreviation for an amino acid residue (for example,
N denotes asparaginyl, T denotes threoninyl,
and S denotes serinyl). The polypeptide of the first
embodiment can comprise the amino acid sequence
M*D*YQ*V*S*SP*IYDIN*YYT*S*E. Preferably, the polypeptide
comprises the amino acid sequence MDYQVSSPIYDINYYTSE.

In a second embodiment, the present invention
provides a polypeptide comprising the amino acid sequence
XEXIXIYXXXNYXXX, wherein X is any synthetic or naturally
occurring amino acid and wherein said polypeptide
comprises less than about 100 contiguous amino acids that
are identical to or substantially identical to the amino
acid sequence of the human CXCR4 chemokine receptor. The
polypeptide can consist essentially of, or consist of,
the sequence EXIXIYXXXNY. Preferably, the polypeptide
comprises the sequence M*EG*IS*IYT*S*D*NYT*E*E*.
Preferably, M*EG*IS*IYT*S*D*NYT*E*E* is
M*EGISIYTSDNYT*E*E*.

In a third embodiment, the present invention 
provides a polypeptide comprising the amino acid sequence 
EHQAFLQFS, wherein said polypeptide comprises less than 
about 100 contiguous amino acids that are identical to or 
substantially identical to the amino acid sequence of the 
human STRL33 chemokine receptor. The polypeptide can 
consist essentially of, or consist of, the sequence
EHQAFLQFS. 

In a fourth embodiment, the present invention 
provides a polypeptide comprising at least a portion of 
an amino acid sequence selected from the group consisting 
of LPPLYSLVFIFGFVGNML, QWDFGNTMCQLLTGLYFIGFFS,
SQYQFWKNFQTLKIVILG, APYNIVLLLNTFQEFFGLNNCS, and
YAFVGEKFRNYLLVFFQK, wherein said polypeptide comprises
less than about 100 contiguous amino acids that are 
identical to or substantially identical to the amino acid 
sequence of the human CCR5 chemokine receptor. 

In a fifth embodiment, the present invention 
provides a polypeptide comprising at least a portion of 
an amino acid sequence selected from the group consisting 
of LLLTIPDFIFANVSEADD, WFQFQHIMVGLILPGIV, and 
IDSFILLEIIKQGCEFEN, wherein said polypeptide comprises 
less than about 100 contiguous amino acids that are 
identical to or substantially identical to the amino acid 
sequence of the human CXCR4 chemokine receptor. 

In a sixth embodiment, the present invention 
provides a polypeptide comprising at least a portion of 
an amino acid sequence selected from the group consisting 
of LVISIFYHKLQSLTDVFL, PFWAYAGIHEWVFGQVMC,
EAISTWLATQMTLGFFL, LTMIVCYSVIIKTLLHAG,
MAVFLLTQMPFNLMKFIRSTHW, HWEYYAMTSFHYTIMVTE,
ACLNPVLYAFVSLKFRKN and SKTFSASHNVEATSMFQL, wherein said
polypeptide comprises less than about 100 contiguous
amino acids that are identical to or substantially
identical to the amino acid sequence of the human STRL33
chemokine receptor.

In a seventh embodiment, the present invention 
provides a polypeptide comprising at least a portion of 
an amino acid sequence selected from the group consisting 
of DTYICEVED, EEVQLLVFGLTANSD, THLLQGQSLTLTLES, and
GEQVEFSFPLAFTVE, wherein said polypeptide comprises less
than about 100 contiguous amino acids that are identical 
to or substantially identical to the amino acid sequence 
of the human CD4 cell-surface protein. 

In the fourth to seventh embodiments, any selected 
portion of the polypeptide can comprise from 1 to about 6 
conservative amino acid substitutions. In an 
alternative, the polypeptide can be partially defined by 
an absence of a polypeptide sequence, outside the region 
of the portion selected from the foregoing sequences, 
that has five, or ten, contiguous amino acid residues 
that have a sequence that consists of an amino acid 
sequence that is identical to or substantially identical 
to the protein to which the polypeptide has homology 
(i.e., CCR5, CXCR4, STRL33, or CD4). In yet another
alternative, the polypeptide can lack a sequence of five
or ten contiguous amino acids which are identical to or
substantially identical to the sequence of the protein
with which the sequence has homology except that one or
more conservatively or neutrally substituted amino acids
replace part of the sequence of the protein to which the
polypeptide has homology. Additionally, any embodiment
of the present inventive polypeptide can also comprise a
pharmaceutically acceptable substituent.

Any embodiment of the present inventive polypeptide 
can be incorporated into a composition, which further 
comprises a carrier. Any suitable embodiment of the 
present inventive polypeptide can be encoded by a nucleic 
acid that can be expressed in a cell. In this regard, 
the present invention further provides a vector 
comprising such a nucleic acid. The nucleic acids and 
vectors also can be incorporated into a composition 
comprising a carrier. 

Additionally, the present invention provides a 
method of making an antibody to a polypeptide of the 
present invention. The present invention also provides a 
method of prophylactically or therapeutically treating an 
HIV infection in a mammal. 

Additionally, the present invention provides an 
anti-idiotypic antibody comprising an internal image of a 
portion of gp120, as well as a method of selecting such
an antibody. 

The present invention also provides a method of
making an antibody to a portion of the gp120 protein that
binds with a portion of CCR5, CXCR4, STRL33, or CD4, as
well as the immunizing compound used to make the
antibody, and the antibody itself. In another embodiment
of the present invention, a method of removing HIV-1 from
a bodily fluid is provided.



BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 depicts a listing of synthetic amino acids 
available (from Bachem, King of Prussia, PA) for 
incorporation into polypeptides of the present invention. 

DETAILED DESCRIPTION OF THE INVENTION 
The present invention provides a polypeptide that 
binds with gp120 of HIV, in particular HIV-1, more
particularly HIV-1^/ under physiological conditions. 
The polypeptide has a number of uses including, but not 
limited to, the use of the polypeptide to elucidate the 
mechanism by which HIV, such as HIV-1, attaches to and/or 
infects a particular cell, to induce an immune response 
in a mammal, in particular a human, to HIV, in particular 
HIV-1, and to inhibit the replication of HIV, in 
particular HIV-1, in an infected mammal, in particular a
human. 

Multiple embodiments of the present inventive 
polypeptide are provided. Each embodiment of the 
polypeptide has a degree of homology to at least one of 
the human CCR5, CXCR4, and STRL33 chemokine receptors, or
the human CD4 cell-surface protein. In each embodiment
provided herein, a letter indicates the standard amino
acid designated by that letter, and a letter followed
directly by an asterisk (*) preferably represents the
amino acid represented by the letter (e.g., N represents
asparaginyl and T represents threoninyl), or a synthetic
or naturally occurring conservative or neutral
substitution therefor. Additionally, in accordance with
convention, all amino acid sequences provided herein are
given either from left to right, or top to bottom, such
that the first amino acid is amino-terminal and the last
is carboxyl-terminal. The synthesis of polypeptides,
either synthetically (i.e., chemically) or biologically, 
is within the skill in the art. 

It is within the skill of the ordinary artisan to 
select synthetic and naturally occurring amino acids that 
make conservative or neutral substitutions for any 
particular naturally occurring amino acids. The skilled 
artisan desirably will consider the context in which any 
particular amino acid substitution is made, in addition
to considering the hydrophobicity or polarity of the
side chain, the general size of the side chain, and the
pK value of side chains with acidic or basic character
under physiological conditions. For example, lysine, 
arginine, and histidine are often suitably substituted 
for each other, and more often arginine and lysine. As 
is known in the art, this is because all three amino 
acids have basic side chains, whereas the pK values of
the side chains of lysine and arginine are much closer to
each other (about 10 and 12) than to that of histidine
(about 6).
Similarly, glycine, alanine, valine, leucine, and 
isoleucine are often suitably substituted for each other, 
with the proviso that glycine is frequently not suitably 
substituted for the other members of the group. This is 
because each of these amino acids is relatively
hydrophobic when incorporated into a polypeptide, but
glycine's lack of a side chain (it has no β-carbon) allows
the phi and psi angles of rotation (around the α-carbon)
so much conformational freedom that glycinyl residues can
trigger changes in conformation or secondary structure
that do not often occur when the other amino acids are
substituted for each other. Other groups of amino acids
frequently suitably substituted for each other include, 
but are not limited to, the group consisting of glutamic 
and aspartic acids; the group consisting of 
phenylalanine, tyrosine and tryptophan; and the group 
consisting of serine, threonine and, optionally, 
tyrosine. Additionally, the skilled artisan can readily
group synthetic amino acids with naturally occurring
amino acids.
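
The substitution groups named in the preceding paragraph can be written down as a small lookup, shown here as a minimal sketch only. The group labels and the helper name `is_conservative` are hypothetical and are not part of the specification. The sketch also ignores the proviso about glycine and the sequence context that the text says the skilled artisan would weigh.

```python
# Residue groups taken from the paragraph above, in one-letter code.
# The dictionary keys are illustrative labels, not terms from the text.
SUBSTITUTION_GROUPS = {
    "basic": set("KRH"),        # lysine, arginine, histidine
    "aliphatic": set("GAVLI"),  # glycine, alanine, valine, leucine, isoleucine
    "acidic": set("DE"),        # aspartic and glutamic acids
    "aromatic": set("FYW"),     # phenylalanine, tyrosine, tryptophan
    "hydroxyl": set("STY"),     # serine, threonine, optionally tyrosine
}


def is_conservative(a: str, b: str) -> bool:
    """Return True if two different residues share at least one group."""
    if a == b:
        return False  # identical residues are not a substitution
    return any(a in group and b in group
               for group in SUBSTITUTION_GROUPS.values())


print(is_conservative("K", "R"))  # lysine and arginine share the basic group
print(is_conservative("K", "D"))  # lysine and aspartic acid share no group
```

A fuller treatment would score glycine separately and could add synthetic amino acids to the groups, as the text contemplates.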

In the context of the present invention, a 
polypeptide is "substantially identical" to another 
polypeptide if it comprises at least about 80% identical 
amino acids. Desirably, at least about 50% of the 
non-identical amino acids are conservative or neutral 
substitutions. Also, desirably, the polypeptides differ
in length (i.e., due to deletion mutations) by no more
than about 10%.
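
As a rough illustration of the "substantially identical" test defined above, the following sketch applies the 80% identity and 10% length thresholds to two sequences in one-letter code. It assumes, for simplicity, that the sequences are compared position by position from the amino terminus with no alignment step, and it does not check the optional 50% conservative-substitution preference. The function name and parameters are hypothetical.

```python
def substantially_identical(a: str, b: str,
                            min_identity: float = 0.80,
                            max_length_diff: float = 0.10) -> bool:
    """Sketch of the ~80% identity / ~10% length-difference test.

    Assumption: residues are compared position by position from the
    amino terminus; a real comparison would align the sequences first.
    """
    longer, shorter = max(len(a), len(b)), min(len(a), len(b))
    if longer == 0:
        return False
    # Length check: sequences should differ by no more than about 10%.
    if (longer - shorter) / longer > max_length_diff:
        return False
    # Identity check: fraction of matching positions over the longer length.
    matches = sum(x == y for x, y in zip(a, b))
    return matches / longer >= min_identity


# One substitution in nine residues (8/9 identical) passes the threshold.
print(substantially_identical("YDINYYTSE", "YDIAYYTSE"))
# Three substitutions in nine residues (6/9 identical) does not.
print(substantially_identical("YDINYYTSE", "YDIAYYAAE"))
```

Dividing by the longer length counts missing residues as mismatches, which is one conservative reading of the definition; other conventions are possible.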

In a first embodiment, the present invention 
provides a polypeptide comprising the amino acid sequence 
YDIXYYXXE, wherein X is any synthetic or naturally 
occurring amino acid residue, and the polypeptide 
comprises less than about 100 contiguous amino acids, 
preferably less than about 50 amino acids, more 
preferably less than about 25 amino acids, and yet more 
preferably less than about 13 amino acids that are 
identical to, or, in the alternative, substantially 
identical to, the amino acid sequence of the human CCR5 
chemokine receptor. 
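
A motif such as YDIXYYXXE, in which X stands for any residue, can be checked against a candidate sequence with a simple pattern match. The sketch below is illustrative only: the helper names are hypothetical, and it assumes every residue is written as a single uppercase letter, so synthetic amino acids without a one-letter code are not handled.

```python
import re


def motif_to_regex(motif: str) -> "re.Pattern[str]":
    """Translate a motif where X means 'any residue' into a regex."""
    parts = ["[A-Z]" if ch == "X" else re.escape(ch) for ch in motif]
    return re.compile("".join(parts))


def find_motif(motif: str, sequence: str) -> int:
    """Return the 0-based start of the first match, or -1 if absent."""
    match = motif_to_regex(motif).search(sequence)
    return match.start() if match else -1


# The first-embodiment motif occurs in the 18-residue sequence
# MDYQVSSPIYDINYYTSE recited above, starting at 0-based index 9.
print(find_motif("YDIXYYXXE", "MDYQVSSPIYDINYYTSE"))
# The motif is absent from the third-embodiment sequence.
print(find_motif("YDIXYYXXE", "EHQAFLQFS"))
```

The same helper applies to the second-embodiment motif XEXIXIYXXXNYXXX, which matches MEGISIYTSDNYTEE from its first residue.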

Preferably, the polypeptide of the first embodiment
comprises YDIXYYXXE, wherein the amino moiety of the
amino-terminal tyrosinyl residue is not bound to another
amino acid residue via a peptidic bond, and the carboxyl
moiety of the glutamyl residue is not bound to another
amino acid residue via a peptidic bond. However, the
polypeptide can consist essentially of YDIXYYXXE and,
optionally, can be modified by one or more
pharmaceutically acceptable substituents, such as, for
example, t-boc or a saccharide.

More particularly, the polypeptide comprises the
amino acid sequence YDIN*YYT*S*E. Preferably, N* is
asparaginyl, T* is threoninyl, and S* is serinyl.

The polypeptide of the first embodiment can comprise
a dodecapeptide selected from the amino acid sequence
M*D*YQ*V*S*SP*IYDIN*YYT*S*E. More preferably, the
polypeptide of the first embodiment comprises the amino
acid sequence MDYQVSSPIYDINYYTSE.

In a second embodiment, the present invention
provides a polypeptide comprising the amino acid sequence
XEXIXIYXXXNYXXX, wherein X is any synthetic or naturally
occurring amino acid, and the polypeptide comprises less
than about 100 contiguous amino acids, preferably less
than about 50 amino acids, and more preferably less than
about 25 amino acids, that are identical to or
substantially identical to the amino acid sequence of the
human CXCR4 chemokine receptor. Optionally, the
polypeptide consists essentially of, or consists of, the
sequence EXIXIYXXXNY.

In a preferred polypeptide of this second
embodiment, the polypeptide comprises the amino acid
sequence M*EG*IS*IYT*S*D*NYT*E*E*. Preferably,
M*EG*IS*IYT*S*D*NYT*E*E* is M*EGISIYTSDNYT*E*E*.

In a third embodiment, the present invention
provides a polypeptide comprising the amino acid sequence
EHQAFLQFS, wherein the polypeptide comprises less than
about 100 contiguous amino acid residues, preferably less
than about 50 contiguous amino acid residues, more
preferably less than about 25 contiguous amino acid
residues, that are identical to or substantially
identical to the amino acid sequence of the human STRL33
chemokine receptor. The polypeptide can consist
essentially of, or consist of, the sequence EHQAFLQFS.

The first three embodiments of the present invention 
provide, among other things, polypeptides having 
substantial identity or identity to the amino-terminal
regions of the chemokine receptors CCR5, CXCR4, and
STRL33. These first three embodiments form a first group
of embodiments of the present invention. The present 
invention also provides, in a second group of 
embodiments, polypeptides having substantial identity or 
identity to an internal region of the human chemokine 
receptors CCR5, CXCR4, and STRL33, as well as to the 
leukocyte cell-surface protein CD4.

This second group of embodiments provides a
polypeptide that binds with HIV gp120 under physiological
conditions and comprises at least a portion of or all of
an amino acid sequence selected from the group consisting
of LPPLYSLVFIFGFVGNML, QWDFGNTMCQLLTGLYFIGFFS,
SQYQFWKNFQTLKIVILG, APYNIVLLLNTFQEFFGLNNCS, and
YAFVGEKFRNYLLVFFQK, wherein the polypeptide comprises
less than about 100 amino acids that are identical to or
substantially identical to the amino acid sequence of the
human CCR5 chemokine receptor; or

selected from the group consisting of
LLLTIPDFIFANVSEADD (165-182), WFQFQHIMVGLILPGIV
(197-214), and IDSFILLEIIKQGCEFEN (261-278), wherein the
polypeptide comprises less than about 100 amino acids
that are identical to or substantially identical to the
amino acid sequence of the human CXCR4 chemokine
receptor; or

selected from the group consisting of
LVISIFYHKLQSLTDVFL (53-70), PFWAYAGIHEWVFGQVMC (85-102),
EAISTWLATQMTLGFFL (185-202), LTMIVCYSVIIKTLLHAG
(205-222), MAVFLLTQMPFNLMKFIRSTHW (237-258),
HWEYYAMTSFHYTIMVTE (257-274), ACLNPVLYAFVSLKFRKN
(281-298) and SKTFSASHNVEATSMFQL (325-342), wherein the
polypeptide comprises less than about 100 amino acids
that are identical to or substantially identical to the
amino acid sequence of the human STRL33 chemokine
receptor; or

selected from the group consisting of DTYICEVED,
EEVQLLVFGLTANSD, THLLQGQSLTLTLES, and GEQVEFSFPLAFTVE,
wherein the polypeptide binds with HIV gp120 under
physiological conditions and comprises less than about
100 amino acids that are identical to or substantially
identical to the amino acid sequence of the human CD4
cell-surface protein. Optionally, the recited amino acid
sequences can comprise 1 to about 6 conservative or
neutral amino acid substitutions.

The polypeptides of this second group of embodiments
preferably comprise less than about 50 amino acid
residues, and more preferably less than about 25 amino
acid residues, and yet more preferably no additional
amino acid residues, that are identical to a protein that
naturally has the recited amino acid sequence. The
polypeptide can be alternatively characterized by an
absence of a region, outside the above-recited amino acid
sequences, that has about five, or about ten, contiguous
amino acid residues whose sequence consists of residues
identical to, or conservatively substituted for, those of
an amino acid sequence of the protein to which the
polypeptide of the compound has homology.

Any embodiment of the present inventive polypeptide
can also comprise a pharmaceutically acceptable
substituent, attachment of which is within the skill in
the art. The pharmaceutical acceptability of
substituents is understood by those skilled in the art.
For example, a pharmaceutically acceptable substituent
can be a biopolymer, such as a polypeptide, an RNA, a
DNA, or a polysaccharide. Suitable polypeptides comprise
fusion proteins, an antibody or fragment thereof, a cell
adhesion molecule or a fragment thereof, or a peptide
hormone. Suitable polysaccharides comprise polyglucose
moieties, such as starch, and derivatives thereof, such as
heparin. The pharmaceutically acceptable substituent
also can be any suitable lipid or lipid-containing
moiety, such as a lipid of a liposome or a vesicle, or
even a lipophilic moiety, such as a prostaglandin, a
steroid hormone, or a derivative thereof. Additionally,
the pharmaceutically acceptable substituent can be a
nucleotide or nucleoside, such as nicotinamide adenine
dinucleotide or thymine, an amino acid residue, a
saccharide or disaccharide, or the residue of another
biomolecule naturally occurring in a cell, such as
inositol, or a vitamin, such as vitamin C, thiamine, or
nicotinic acid. Synthetic organic moieties also can be
pharmaceutically acceptable substituents, such as t-butyl
carbonyl, an acetyl moiety, quinine, polystyrene and
other biologically acceptable polymers. Optionally, a
pharmaceutically acceptable substituent can be selected 
from the group consisting of a C1-C18 alkyl, a C2-C18
alkenyl, a C2-C18 alkynyl, a C6-C18 aryl, a C7-C18 alkaryl, a
C7-C18 aralkyl, and a C3-C18 cycloalkyl, wherein any of the
foregoing moieties that are cyclic comprise from 0 to 2 
atoms per carbocyclic ring, which can be the same or 
different, and are selected from the group consisting of 
nitrogen, oxygen, and sulfur. 

Any of the substituents from this group can be 
substituted by one to six substituent moieties, which can 
be the same or different, selected from the group 
consisting of an amino moiety, a carbamate moiety, a
carbonate moiety, hydroxyl, a phosphamate moiety, a
phosphate moiety, a phosphonate moiety, a pyrophosphate
moiety, a triphosphate moiety, a sulfamate moiety, a
sulfate moiety, a sulfonate moiety, a C1-C8 monoalkylamine
moiety, a C1-C8 dialkylamine moiety, and a C1-C8
trialkylamine moiety.

Any embodiment of the present inventive polypeptide 
can be encoded by a nucleic acid and can be expressed in 
a cell. The skilled artisan will recognize that the 
encoded polypeptide as well as any pharmaceutically 
acceptable substituent to be incorporated into the 
polypeptide, e.g., a formyl or acetyl substituent on an 
amino-terminal methionine or a saccharide, will 
preferably be produced by a cell that can express the
polypeptide of the present invention. Accordingly, the
amino acids incorporated into the polypeptide encoded by
the nucleic acid are preferably naturally occurring.

A nucleic acid as described above can be cloned into 
any suitable vector and can be used to transduce, 
transform, or transfect any suitable host. The selection 
of vectors and methods to construct them are commonly 
known to persons of ordinary skill in the art and are 
described in general technical references (see, in
general, "Recombinant DNA Part D," Methods in Enzymology,
Vol. 153, Wu and Grossman, eds., Academic Press (1987)).
Desirably, the vector comprises regulatory sequences,
such as transcription and translation initiation and
termination codons, which are specific to the type of
host (e.g., bacterium, fungus, plant, or animal) into
which the vector is to be inserted, as appropriate and
taking into consideration whether the vector is DNA or
RNA. Preferably, the vector comprises regulatory
sequences that are specific to the genus of the host.
Most preferably, the vector comprises regulatory
sequences that are specific to the species of the host
and is optimized for the expression of an above-described
polypeptide.

Constructs of vectors, which are circular or linear,
can be prepared to contain an entire nucleic acid
sequence as described above or a portion thereof ligated
to a replication system that is functional in a
prokaryotic or eukaryotic host cell. Replication systems
can be derived from ColE1, 2 mu plasmid, λ, SV40, bovine
papilloma virus, and the like.
Suitable vectors include those designed for
propagation and expansion, or for expression, or both. A
preferred cloning vector is selected from the group
consisting of the pUC series, the pBluescript series
(Stratagene, La Jolla, CA), the pET series (Novagen,
Madison, WI), the pGEX series (Pharmacia Biotech,
Uppsala, Sweden), and the pEX series (Clontech, Palo
Alto, CA). Examples of animal expression vectors include
pEUK-C1, pMAM and pMAMneo (Clontech, Palo Alto, CA).

An expression vector can comprise a native or
nonnative promoter operably linked to a nucleic acid
molecule encoding an above-described polypeptide. The
selection of promoters, e.g., strong, weak, inducible,
tissue-specific and developmental-specific, is within the
skill in the art. Similarly, the combining of a nucleic
acid molecule as described above with a promoter is also
within the skill in the art.

The skilled artisan will also recognize that the
polypeptide has the ability to bind the gp120 protein,
which is most often found outside of cells. Accordingly,
the present inventive nucleic acid advantageously can
comprise a nucleic acid sequence that encodes a signal
sequence such that a signal sequence is translated as a
fusion protein with the present inventive polypeptide to
form a signal sequence-polypeptide fusion. The signal
sequence can cause secretion of the entire polypeptide,
including the signal sequence (which is a
pharmaceutically acceptable substituent), or can be
cleaved from the polypeptide (i.e., the polypeptide of
the compound) prior to, or during, secretion so that at
least the present inventive polypeptide is secreted out
of a cell in which the nucleic acid is expressed.




Alternatively, the nucleic acid comprises or encodes 
an antisense nucleic acid molecule or a ribozyme that is 
specific for a specified amino acid sequence of an above- 
described polypeptide. A nucleic acid sequence 
introduced in antisense suppression generally is 
substantially identical to at least a portion of the 
endogenous gene or gene to be repressed, but need not be 
identical. Thus, the vectors can be designed such that 
the inhibitory effect applies to other proteins within a 
family of genes exhibiting homology or substantial 
homology to the target gene. The introduced sequence
also need not be full-length relative to either of the 
primary transcription product or the fully processed 
mRNA. Generally, higher homology can be used to 
compensate for the use of a shorter sequence. 
Furthermore, the introduced sequence need not have the 
same intron or exon pattern, and homology of non-coding 
segments will be equally effective. 

Ribozymes also have been reported to have use as a 
means to inhibit expression of endogenous genes. It is 
possible to design ribozymes that specifically pair with 
virtually any target RNA and cleave the phosphodiester 
backbone at a specific location, thereby functionally 
inactivating the target RNA. In carrying out this 
cleavage, the ribozyme is not itself altered and is, 
thus, capable of recycling and cleaving other molecules, 
making it a true enzyme. The inclusion of ribozyme 
sequences within antisense RNAs confers RNA-cleaving 
activity upon them, thereby increasing the activity of 
the constructs. The design and use of target
RNA-specific ribozymes is described in Haseloff et al.,
Nature 334: 585-591 (1988).

Further provided by the present invention is a
composition comprising an above-described polypeptide or
nucleic acid and a carrier therefor. Another composition
provided by the present invention is a composition
comprising an antibody to an above-described polypeptide,
or an anti-antibody to an above-described polypeptide.

Any embodiment of the present invention including 
the present inventive polypeptide, nucleic acid, 
antibody, and anti-antibody, can be incorporated into a
composition comprising a carrier. The carrier can serve
any function. For example, the carrier can increase the
solubility of the present inventive polypeptide, nucleic
acid or antibody in aqueous solutions. Additionally, the
carrier can protect the present inventive polypeptide,
nucleic acid or antibody from environmental insults, such 
as dehydration, oxidation, and photolysis. Moreover, the 
carrier can serve as an adjuvant, or as a timed-release 
control means in a biological system. 

Antibodies can be generated in accordance with
methods known in the art. See, for example, Benjamin, In
Immunology: a short course, Wiley-Liss, NY, 1996, pp.
436-437; Kuby, In Immunology, 3rd ed., Freeman, NY,
1997, pp. 455-456; Greenspan et al., FASEB J. 7: 437-443
(1993); and Poskitt, Vaccine 9: 792-796 (1991).
Anti-antibodies (i.e., anti-idiotypic antibodies) also
can be generated in accordance with methods known in the
art (see, for example, Benjamin, In Immunology: a short
course, Wiley-Liss, NY, 1996, pp. 436-437; Kuby, In
Immunology, 3rd ed., Freeman, NY, 1997, pp. 455-456;
Greenspan et al., FASEB J. 7: 437-443 (1993); Poskitt,
Vaccine 9: 792-796 (1991); and Madiyalakan et al.,
Hybridoma 14: 199-203 (1995) ("Anti-idiotype induction
therapy")). Such antibodies can be obtained and employed
either in solution-phase or coupled to a desired
solid-phase matrix. Having in hand such antibodies, one
skilled in the art will further appreciate that such
antibodies, using well-established procedures (e.g., such
as described by Harlow and Lane (1988, supra)), are useful
in the detection, quantification, or purification of
gp120 or HIV, particularly HIV-1, conjugates of each and
host cells transformed to produce a gp120 receptor or a
derivative thereof. Such antibodies are also useful in a
method of prevention or treatment of a viral infection
and in a method of inducing an immune response to HIV as
provided herein.

In view of the above, an above-described polypeptide
can be administered to an animal. The animal generates
anti-polypeptide antibodies. Among the anti-polypeptide
antibodies generated or induced in the animal are
antibodies that have an internal image of gp120. In
accordance with well-known methods, polyclonal or
monoclonal antibodies can be obtained, isolated and
selected. Selection of an anti-polypeptide antibody that
has an internal image of gp120 can be based upon
competition between the anti-polypeptide antibody and
gp120 for binding to an above-described polypeptide, or
upon the ability of the anti-polypeptide antibody to bind
to a free polypeptide as opposed to a polypeptide bound
to gp120. Such an anti-antibody can be administered to
an animal to prevent or treat an HIV infection in
accordance with methods provided herein.

Although nonhuman anti-idiotypic antibodies, such as
an anti-polypeptide antibody that has an internal image
of gp120 and, therefore, is anti-idiotypic to gp120, are
useful for prophylaxis in humans, their favorable
properties might, in certain instances, be further
enhanced and/or their adverse properties further
diminished, through "humanization" strategies, such as
those recently reviewed by Vaughan, Nature Biotech. 16:
535-539 (1998).

Prior to administration to an animal, such as a
mammal, in particular a human, an above-described
polypeptide, nucleic acid, antibody or anti-antibody can
be formulated into various compositions by combination
with appropriate carriers, in particular,
pharmaceutically acceptable carriers or diluents, and can
be formulated to be appropriate for either human or
veterinary applications.

The present invention also provides a method of
making an antibody. The method comprises administering
an immunogenic amount of an above-described polypeptide
or nucleic acid to an animal, such as a mammal, in
particular a human. Determining the quantity of a
polypeptide or nucleic acid that is immunogenic will
depend in part on the degree of similarity to a protein
or other molecule of the inoculated animal, the route of
administration of the polypeptide or nucleic acid, and
the size of the polypeptide administered or encoded by
the administered nucleic acid. If necessary, the
polypeptide or nucleic acid can be mixed with or ligated
to a substance (or an adjuvant) that enhances its
immunogenicity. Such calculations and procedures are
within the skill of the ordinary artisan. Additionally,
the present inventive method preferably can be used to
induce an immune response against HIV, particularly
HIV-1, in a mammal, particularly a human.

In view of the above, the present invention further
provides a method of prophylactically or therapeutically
treating an HIV infection in a mammal, particularly a
human, in need thereof. The method comprises
administering to the mammal an HIV replication-inhibiting
effective amount of an above-described polypeptide,
nucleic acid, or an anti-antibody to an above-described
polypeptide or a nucleic acid encoding such a
polypeptide.

The present invention also provides a method of 
prophylactically or therapeutically treating HIV 
infection in a mammal. The method comprises 
administering to the mammal an effective amount of an
above-described polypeptide or nucleic acid. Prior to
administration to an animal, such as a mammal, in
particular a human, an above-described polypeptide or
nucleic acid can be formulated into various compositions
by combination with appropriate carriers, in particular,
pharmaceutically acceptable carriers or diluents, and can
be formulated to be appropriate for either human or 
veterinary applications. 

Thus, a composition for use in the method of the 
present invention can comprise one or more of the 
polypeptides, nucleic acids, antibodies or
anti-antibodies described herein, preferably in combination
with a pharmaceutically acceptable carrier.
Pharmaceutically acceptable carriers are well-known to
those skilled in the art, as are suitable methods of 
administration. The choice of carrier will be 
determined, in part, by whether a polypeptide or a 
nucleic acid is to be administered, as well as by the 
particular method used to administer the composition. 
Optionally, the carrier can be selected to increase the 
solubility of the composition or mixture, e.g., a 
liposome or polysaccharide. One skilled in the art will 
also appreciate that various routes of administering a 
composition are available, and, although more than one 
route can be used for administration, a particular route 
can provide a more immediate and more effective reaction 
than another route. Accordingly, there are a wide 
variety of suitable formulations of compositions that can 
be used in the present inventive methods. 

A composition in accordance with the present 
invention, alone or in further combination with one or 
more other active agents, can be made into a formulation 
suitable for parenteral administration, preferably 
intraperitoneal administration. Such a formulation can 
include aqueous and nonaqueous, isotonic sterile 
injection solutions, which can contain antioxidants, 
buffers, bacteriostats, and solutes that render the 
formulation isotonic with the blood of the intended
recipient, and aqueous and nonaqueous sterile suspensions
that can include suspending agents, solubilizers,
thickening agents, stabilizers, and preservatives. The
formulations can be presented in unit dose or multi-dose
sealed containers, such as ampules and vials, and can be
stored in a freeze-dried (lyophilized) condition
requiring only the addition of the sterile liquid
carrier, for example, water, for injections, immediately
prior to use. Extemporaneously injectable solutions and 
suspensions can be prepared from sterile powders, 
granules, and tablets, as described herein. 

A formulation suitable for oral administration can 
consist of liquid solutions, such as an effective amount 
of the compound dissolved in diluents, such as water, 
saline, or fruit juice; capsules, sachets or tablets, 
each containing a predetermined amount of the active 
ingredient, as solid or granules; solutions or 
suspensions in an aqueous liquid; and oil-in-water 
emulsions or water-in-oil emulsions. Tablet forms can 
include one or more of lactose, mannitol, corn starch, 
potato starch, microcrystalline cellulose, acacia, 
gelatin, colloidal silicon dioxide, croscarmellose 
sodium, talc, magnesium stearate, stearic acid, and other 
excipients, colorants, diluents, buffering agents, 
moistening agents, preservatives, flavoring agents, and 
pharmacologically compatible carriers. 

Similarly, a formulation suitable for oral 
administration can include lozenge forms, which can
comprise the active ingredient in a flavor, usually
sucrose and acacia or tragacanth; pastilles comprising
the active ingredient in an inert base, such as gelatin 
and glycerin, or sucrose and acacia; and mouthwashes 
comprising the active ingredient in a suitable liquid 
carrier; as well as creams, emulsions, gels, and the like 
containing, in addition to the active ingredient, such 
carriers as are known in the art.

An aerosol formulation suitable for administration
via inhalation also can be made. The aerosol formulation
can be placed into a pressurized acceptable propellant,
such as dichlorodifluoromethane, propane, nitrogen, and
the like.

A formulation suitable for topical application can 
be in the form of creams, ointments, or lotions. 

A formulation for rectal administration can be
presented as a suppository with a suitable base
comprising, for example, cocoa butter or a salicylate. A
formulation suitable for vaginal administration can be
presented as a pessary, tampon, cream, gel, paste, foam,
or spray formula containing, in addition to the active
ingredient, such carriers as are known in the art to be
appropriate.

Important general considerations for design of
delivery systems and compositions, and for routes of
administration, for polypeptide drugs also apply
(Eppstein, CRC Crit. Rev. Therapeutic Drug Carrier
Systems 5, 99-139, 1988; Siddiqui et al., CRC Crit. Rev.
Therapeutic Drug Carrier Systems 3, 195-208, 1987; Banga
et al., Int. J. Pharmaceutics 48, 15-50, 1988; Sanders,
Eur. J. Drug Metab. Pharmacokinetics 15, 95-102, 1990;
Verhoef, Eur. J. Drug Metab. Pharmacokinetics 15, 83-93,
1990). The appropriate delivery system for a given
polypeptide will depend upon its particular nature, the
particular clinical application, and the site of drug
action. As with any protein drug, oral delivery will
likely present special problems, due primarily to
instability in the gastrointestinal tract and poor
absorption and bioavailability of intact, bioactive drug
therefrom. Therefore, especially in the case of oral
delivery, but also possibly in conjunction with other
routes of delivery, it will be necessary to use an
absorption-enhancing agent in combination with a given
polypeptide. A wide variety of absorption-enhancing
agents have been investigated and/or applied in
combination with protein drugs for oral delivery and for
delivery by other routes (Verhoef, 1990, supra; van
Hoogdalem, Pharmac. Ther. 44, 407-43, 1989; Davis, J.
Pharm. Pharmacol. 44(Suppl. 1), 186-90, 1992). Most
commonly, typical enhancers fall into the general
categories of (a) chelators, such as EDTA, salicylates,
and N-acyl derivatives of collagen, (b) surfactants, such
as lauryl sulfate and polyoxyethylene-9-lauryl ether, (c)
bile salts, such as glycocholate and taurocholate, and
derivatives, such as taurodihydrofusidate, (d) fatty
acids, such as oleic acid and capric acid, and their
derivatives, such as acylcarnitines, monoglycerides, and
diglycerides, (e) non-surfactants, such as unsaturated
cyclic ureas, (f) saponins, (g) cyclodextrins, and (h)
phospholipids.

Other approaches to enhancing oral delivery of
protein drugs can include the aforementioned chemical
modifications to enhance stability to gastrointestinal
enzymes and/or increased lipophilicity. Alternatively,
the protein drug can be administered in combination with
other drugs or substances that directly inhibit proteases
and/or other potential sources of enzymatic degradation
of proteins. Yet another alternative approach to prevent
or delay gastrointestinal absorption of protein drugs is
to incorporate them into a delivery system that is
designed to protect the protein from contact with the
proteolytic enzymes in the intestinal lumen and to
release the intact protein only upon reaching an area
favorable for its absorption. A more specific example of
this strategy is the use of biodegradable microcapsules
or microspheres, both to protect vulnerable drugs from
degradation, as well as to effect a prolonged release of
active drug (Deasy, in Microencapsulation and Related
Processes, Swarbrick, ed., Marcel Dekker, Inc.: New
York, 1984, pp. 1-60, 88-89, 208-11). Microcapsules also
can provide a useful way to effect a prolonged delivery
of a protein drug after injection (Maulding, J.
Controlled Release 6, 167-76, 1987).

The dose administered to an animal, such as a 
mammal, particularly a human, in the context of the 
present invention should be sufficient to effect a 
therapeutic or prophylactic response in the individual 
over a reasonable time frame. The dose will be 
determined by the particular polypeptide, nucleic acid, 
antibody, or anti-antibody administered, the severity of 
any existing disease state, as well as the body weight 
and age of the individual. The size of the dose also 
will be determined by the existence of any adverse side 
effects that may accompany the use of the particular 
polypeptide, nucleic acid, antibody or anti-antibody 
employed. It is always desirable, whenever possible, to 
keep adverse side effects to a minimum. 

The dosage can be in unit dosage form, such as a
tablet or capsule. The term "unit dosage form" as used
herein refers to physically discrete units suitable as
unitary dosages for human and animal subjects, each unit
containing a predetermined quantity of a vector, alone or
in combination with other active agents, calculated in an
amount sufficient to produce the desired effect in
association with a pharmaceutically acceptable diluent,
carrier, or vehicle. The specifications for the unit
dosage forms of the present invention depend on the
particular embodiment employed and the effect to be
achieved, as well as the pharmacodynamics associated with
each polypeptide, nucleic acid or anti-antibody in the
host. The dose administered should be an "HIV infection
inhibiting amount" of an above-described polypeptide or
nucleic acid or an "immune response-inducing effective
amount" of an above-described polypeptide, an
above-described nucleic acid, or an antibody as appropriate.

Another composition provided by the present
invention is a composition comprising a solid support
matrix to which is attached an above-described
polypeptide, or an anti-antibody to an above-described
polypeptide. The solid matrix can comprise other
functional reagents including, for example, polyethylene
glycol, dextran, albumin and the like, whose intended
effector functions may include one or more of the
following: to improve stability of the conjugate; to
increase the half-life of the conjugate; to increase
resistance of the conjugate to proteolysis; to decrease
the immunogenicity of the conjugate; to provide a means
to attach or immobilize a functional polypeptide or
anti-antibody onto a solid support matrix (see, for
example, Harris, in Poly(Ethylene Glycol) Chemistry:
Biotechnical and Biomedical Applications, Harris, ed.,
Plenum Press: New York (1992), pp. 1-14). Conjugates
furthermore may comprise a polypeptide or anti-antibody
coupled to an effector molecule, each of which,
optionally, may have different functions (e.g., a
toxin molecule (or an immunological reagent) and a
polyethylene glycol (or dextran or albumin) molecule).
Diverse applications and uses of functional proteins and
polypeptides, attached to or immobilized on a solid
support matrix, are exemplified more specifically for
poly(ethylene glycol) conjugated proteins or peptides in
a review by Holmberg et al. (In Poly(Ethylene Glycol)
Chemistry: Biotechnical and Biomedical Applications,
Harris, ed., Plenum Press: New York, 1992, pp. 303-324).

In addition, the present invention provides a method
of removing HIV from a bodily fluid of an animal. The
method comprises extracorporeally contacting the bodily
fluid of the animal with a solid support matrix to which
is attached an above-described polypeptide or an
anti-antibody to an above-described polypeptide.
Alternatively, the bodily fluid can be contacted with the
polypeptide or anti-antibody in solution and then the
solution can be contacted with a solid support matrix to
which is attached a means to remove the polypeptide or
anti-antibody to which is bound HIV gp120 from the bodily
fluid.

Methods of attaching an herein-described
polypeptide, or an anti-antibody, to a solid support
matrix are known in the art. "Attached" is used herein
to refer to attachment to (or coupling to) and
immobilization in or on a solid support matrix. See, for
example, Harris, in Poly(Ethylene Glycol) Chemistry:
Biotechnical and Biomedical Applications, Harris, ed.,
Plenum Press: New York (1992), pp. 1-14, and
international patent application WO 91/02714 (Saxinger).
Diverse applications and uses of functional polypeptides
attached to or immobilized on a solid support matrix are
exemplified more specifically for poly(ethylene glycol)
conjugated proteins or peptides in a review by Holmberg
et al. (In Poly(Ethylene Glycol) Chemistry: Biotechnical
and Biomedical Applications, Harris, ed., Plenum Press:
New York, 1992, pp. 303-324).

The present invention also provides a method of
making an antibody that binds to gp120 of HIV under
physiological conditions. The method comprises labeling
an embodiment of the present inventive compound to obtain
a labeled compound. Labeling compounds is within the
skill of the ordinary artisan. For example, the present
inventive compound can be labeled with a radioactive atom,
such as 125I, in the same or a similar manner as was
performed in the examples provided below. Alternatively,
an enzyme, such as horseradish peroxidase, can be
attached to or incorporated into the present inventive
compound. Then by exposing a chromogenic or photogenic
compound to the compound, a signal indicative of the
presence and quantity of the compound present can be
generated. In another alternative, a polyhistidinyl
moiety can be attached to, or incorporated into, the
present inventive compound so that the present inventive
compound will react with high affinity to transition
metal ions such as nickel, copper, or zinc ions; this
reaction can be used as the basis to quantify the amount
of the present inventive compound present at a particular
location. In yet another alternative, the present
inventive compound can be used as antigen to a standard
antibody that specifically recognizes an antigenic
epitope of the present inventive compound. As is
well-known, the standard antibody can itself be labeled or
used in conjunction with an additional antibody that is
labeled with an enzyme, radioisotope, or other suitable
means. The skilled artisan will recognize that there is
a plethora of other suitable means and methods to label
the present inventive compound.

This present inventive method of making an antibody
that binds to a gp120 envelope protein of HIV further
comprises providing a library of synthetic peptides. The
library consists of a multiplicity of
synthetically-produced polypeptides that are homologous, and preferably
essentially identical (i.e., having the same primary
amino acid residue sequence, ignoring blocking groups,
phosphorylation of serinyl, threoninyl, and tyrosinyl
residues, hydroxylation of prolinyl residues, and the
like) or identical, to a continuous region of an HIV
gp120 envelope protein. The polypeptides of the library
can be any suitable length. While larger regions allow
faster scanning and tend to preserve non-linear epitopes,
shorter length polypeptides allow more sensitive
screening of the primary sequence of the gp120 protein.
However, polypeptides that are too short can lose
essential secondary structure or cleave reactive sites
into one or more pieces. Preferably, a mixture of short
and long polypeptides is incorporated into the library;
however, the library can consist of polypeptides of a
single length (measured in amino acid residues). For the
sake of convenience the library can be split into
multiple parts, and screened by parts. Typically, the
polypeptides of the library will be between about 6 and
about 45 amino acid residues in length.

Typically, the library will comprise a series of
polypeptides each having an identical sequence to that of
gp120 but having an amino-terminus a particular number of
amino acids downstream of the amino-terminus of the prior
polypeptide (see the Examples section below). The distance,
measured in amino acid residues, is referred to as the
offset. Preferably, in libraries that are characterized by
the existence of an offset, the offset is not greater
than the product of the length of the longest polypeptide,
measured in amino acid residues, and 1.5, preferably 1.0,
and more preferably 0.5. The library can be
alternatively characterized by the existence of an offset
not greater than 30, preferably 15, and more
preferably 4.
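
By way of illustration only, the construction of an offset
library of this kind can be sketched in Python. The function
names, the 18-residue length, the offset of 4, and the
placeholder sequence are assumptions made for the sketch; the
placeholder is not a gp120 sequence.

```python
def tile_peptides(sequence, length, offset):
    """Return (start, end, peptide) tuples using 1-based residue numbering.

    Each peptide begins `offset` residues downstream of the previous one.
    The final peptide may be shorter than `length` if the sequence runs out.
    """
    if length <= 0 or offset <= 0:
        raise ValueError("length and offset must be positive")
    peptides = []
    for start in range(0, len(sequence), offset):
        peptide = sequence[start:start + length]
        peptides.append((start + 1, start + len(peptide), peptide))
        if start + length >= len(sequence):
            break  # this window already reaches the carboxyl terminus
    return peptides


def offset_within_limit(longest_length, offset, factor=0.5):
    """Check that the offset does not exceed factor x longest peptide length."""
    return offset <= factor * longest_length


# Hypothetical 26-residue placeholder sequence (NOT a real gp120 sequence).
toy = "ACDEFGHIKLMNPQRSTVWYACDEFG"
library = tile_peptides(toy, length=18, offset=4)

for start, end, peptide in library:
    print(f"{start}-{end}\t{peptide}")
```

With an offset of 4 and a length of 18, adjacent peptides share
14 residues, so a binding site that is split by one window
boundary is likely to be intact in a neighboring peptide.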

Each polypeptide of the library is substantially
isolated from every other polypeptide of said library and
is located in a known position. For example, each
polypeptide can be bound to a solid support that is
in a vessel or that can be placed in a vessel. The
vessel preferably enables each polypeptide to be covered
in a liquid that does not contact any other
polypeptide of the library. By way of example, each
polypeptide can be bound to a bead that is placed in a
vessel (or tube) or can be bound to the well of a
multi-well assay plate. Alternatively, an array of
polypeptides can be fashioned, for example on a microchip
device (as is presently used in some DNA sequencing
devices and methods), and the entire array can be bathed
in a single solution.

Each polypeptide is then individually contacted with 
the labeled compound such that a portion of the labeled 
compound can bind with the polypeptide of the library. 
In this way, a bound population of each labeled compound 
of the present invention and an unbound population of the 
labeled compound is generated. The phrase "individually
contacted" means that each polypeptide has the opportunity
to bind with the labeled compound and the quantity of 
labeled compound bound by each can be determined. 

The method then comprises removing substantially all 
of the unbound labeled compound from the position 
occupied by each polypeptide. That is, the solution 
comprising the labeled compound is separated from the 
polypeptides of the library and the bound population of 
the labeled compound. This can be done by any suitable 
method, e.g., by aspiration and one or more washing steps 
comprising adding a quantity of liquid sufficient to 
cover all the surfaces that were contacted by the labeled 
compound and aspirating away substantially all of the
wash liquid.

The amount of labeled compound that remains 
co-localized with each polypeptide of the library is then 
measured to determine the quantity of labeled compound 
bound by each polypeptide. The amount of the present 
inventive compound bound by each polypeptide can be 
directly evaluated to identify a portion of the HIV gp120
envelope protein that binds to an HIV receptor selected
from the group consisting of CCR5, CXCR4, STRL33, and
CD4. This information is then used to identify and
provide an immunizing compound. The immunizing compound
comprises a polypeptide comprising an amino acid sequence
that is homologous to, or preferably is essentially
identical to, or identical to, the portion of the HIV-1
gp120 envelope protein that binds with CD4, CCR5, CXCR4,
and/or STRL33. The immunizing compound can be provided by
processing gp120, e.g., proteolytically digesting gp120
that has been isolated from a preparation of HIV-1.
Preferably, however, the immunizing compound is prepared
synthetically, or by genetic engineering, or by a
combination of genetic engineering and synthetic methods.
The immunizing compound can comprise a pharmaceutically
acceptable substituent, can be encoded by a nucleic acid
that can be expressed in a cell, can be mixed with a
carrier, and is an inventive aspect of the present
invention.

An immunogenic quantity of the immunizing compound 
is then inserted into an animal (e.g., a human, or a 
rodent, a canine, a feline, or a ruminant) in a manner 
consistent with the discussion of a method of raising an 
antibody to the present inventive compounds that are 
homologous to portions of CCR5, CXCR4, STRL33, and CD4, 
above. The insertion of the immunizing compound causes 
the inoculated animal to produce an antibody that binds 
with said portion of the HIV gp120 envelope protein.
Thus the present invention also provides an antibody that
binds to an HIV gp120 envelope protein, as well as an
antigen binding protein comprising one or more 
complementarity determining regions of the antibody 
(e.g., a Fab, a Fab,., an Fv, a single-chain antibody, a 
diabody, and humanized variants of all of the above, all
of which are within the skill in the art).

The antibody or variant thereof is preferably useful
in detecting or diagnosing the presence of HIV gp120
envelope protein, and thus HIV, in an animal. The
antibody also preferably prevents or attenuates
infection of an animal exposed to HIV, to whom an
effective quantity of the antibody or a variant thereof
has been administered or in whom it has been produced in
response to inoculation with the immunizing compound. The
antibody preferably also is useful in treating or preventing
(i.e., inhibiting) HIV infection in an animal to whom a
suitable dose has been administered or in which a
suitable quantity of antibody has been produced. The
antibody is also useful in the study of HIV infection of
mammalian cells, the host range specificities of HIV
infection, and preferably, the mechanism by which
antibodies neutralize infectious viruses.

EXAMPLES 

The following examples further illustrate the 
present invention but, of course, should not be construed 
as limiting the scope of the claimed invention in any 
way. 

Synthetic peptide arrays were constructed in 96-well
microtiter plates in accordance with the method set forth
in WO 91/02714 (Saxinger), and used to test the binding
of HIV-1^ envelope gp120 that had been labeled with
radioactive iodine (radiolabeling by standard methods).
After incubating the radiolabeled gp120 in a well with
each synthetic peptide, a washing step was performed to
remove unbound label, and the relative level of
radioactivity remaining in each well of the plate was
evaluated to determine the relative affinity of each
peptide for the gp120. The synthesis of the peptides and
the quantity of binding between the synthetic peptides
and the gp120 were found to be suitably reproducible,
precise, and sensitive. Initial screening of the entire
primary sequence of the chemokine and CD4 receptor
molecules was taken 18 amino acid residues at a time.

The authenticity of the binding signals generated by
this technique has been repeatedly demonstrated by
showing that antibodies to CCR5 and CXCR4 are able to
inhibit the binding of radiolabeled gp120 to the
polypeptides derived from CCR5 and CXCR4 that show a high
affinity for binding with gp120. Additionally, the
accuracy of the binding assay used hereinbelow is
demonstrated by Example 7.

Example 1 

This example identifies segments of the CCR5
co-receptor that bind with gp120.

The first column in the table below indicates the
number of the amino acid in the wild-type CCR5 receptor.
The second column explicitly identifies the peptide
sequence. The third column indicates the radioactive
counts recorded in twenty minutes (i.e., the cpm x 20)
after the background or non-specific counts had been
subtracted. The fourth column contains an X in each row
for which the listed polypeptide bound with high affinity
to gp120. The fifth and final column contains an X in
each row wherein the listed sequence binds with
substantial affinity but is weak in comparison to other
samples, particularly adjacent samples.



SEQ SEG   PEPTIDE              Counts per 20 min (average - background)   Peak activity   Non-peak activity

empty (control)
1-18      MDYQVSSPIYDINYYTSE
5-22      VSSPIYDINYYTSEPCQK
9-26      IYDINYYTSEPCQKINVK
13-30     NYYTSEPCQKINVKQIAA
17-34     SEPCQKINVKQIAARLLP
21-38     QKINVKQIAARLLPPLYS
25-42     VKQIAARLLPPLYSLVFI
29-46     AARLLPPLYSLVFIFGFV
33-50     LPPLYSLVFIFGFVGNML
37-54     YSLVFIFGFVGNMLVILI
41-58     FIFGFVGNMLVILILINC
45-62     FVGNMLVILILINCKRLK
49-66     MLVILILINCKRLKSMTD
53-70     LILINCKRLKSMTDIYLL
57-74     NCKRLKSMTDIYLLNLAI
61-78     LKSMTDIYLLNLAISDLF
65-82     TDIYLLNLAISDLFFLLT
69-86     LLNLAISDLFFLLTVPFW
73-90     AISDLFFLLTVPFWAHYA
77-94     LFFLLTVPFWAHYAAAQW
81-98     LTVPFWAHYAAAQWDFGN
85-102    FWAHYAAAQWDFGNTMCQ
89-106    YAAAQWDFGNTMCQLLTG
93-110    QWDFGNTMCQLLTGLYFI
97-114    GNTMCQLLTGLYFIGFFS
101-118   CQLLTGLYFIGFFSGIFF
105-122   TGLYFIGFFSGIFFIILL
109-126   FIGFFSGIFFIILLTIDR
113-130   FSGIFFIILLTIDRYLAV
117-134   FFIILLTIDRYLAVVHAV
121-138   LLTIDRYLAVVHAVFALK
125-142   DRYLAVVHAVFALKARTV
129-146   AVVHAVFALKARTVTFGV
133-150   AVFALKARTVTFGVVTSV

228



20 



18 



33 



705 



347 



343 



62 



84 



25 



210 



38 



144 



41 



173 



306 



212 



494 



1019 



941 



489 



80 



76 



83 



77 



31 



62 



34 



63

137-154   LKARTVTFGVVTSVITWV     74
141-158   TVTFGVVTSVITWVVAVF    -25
145-162   GVVTSVITWVVAVFASLP     69
149-166   SVITWVVAVFASLPGIIF     46
153-170   WVVAVFASLPGIIFTRSQ     87
157-174   VFASLPGIIFTRSQKEGL     54
161-178   LPGIIFTRSQKEGLHYTC    118
165-182   IFTRSQKEGLHYTCSSHF     98
169-186   SQKEGLHYTCSSHFPYSQ    304
173-190   GLHYTCSSHFPYSQYQFW    301
177-194   TCSSHFPYSQYQFWKNFQ    367
181-198   HFPYSQYQFWKNFQTLKI   1008
185-202   SQYQFWKNFQTLKIVILG   1572
189-206   FWKNFQTLKIVILGLVLP     40
193-210   FQTLKIVILGLVLPLLVM     45
197-214   KIVILGLVLPLLVMVICY     65
201-218   LGLVLPLLVMVICYSGIL    180
205-222   LPLLVMVICYSGILKTLL     68
209-226   VMVICYSGILKTLLRCRN     -8
213-230   CYSGILKTLLRCRNEKKR     70
217-234   ILKTLLRCRNEKKRHRAV     19
221-238   LLRCRNEKKRHRAVRLIF    102
225-242   RNEKKRHRAVRLIFTIMI     23
229-246   KRHRAVRLIFTIMIVYFL     36
233-250   AVRLIFTIMIVYFLFWAP     62
237-254   IFTIMIVYFLFWAPYNIV    121
241-258   MIVYFLFWAPYNIVLLLN    214
245-262   FLFWAPYNIVLLLNTFQE    616
249-266   APYNIVLLLNTFQEFFGL   1962
253-270   IVLLLNTFQEFFGLNNCS   2134
257-274   LNTFQEFFGLNNCSSSNR    293
261-278   QEFFGLNNCSSSNRLDQA     63
265-282   GLNNCSSSNRLDQAMQVT    -31
269-286   CSSSNRLDQAMQVTETLG     90
273-290   NRLDQAMQVTETLGMTHC     10
277-294   QAMQVTETLGMTHCCINP     81
281-298   VTETLGMTHCCINPIIYA     15
285-302   LGMTHCCINPIIYAFVGE    282
289-306   HCCINPIIYAFVGEKFRN    200
293-310   NPIIYAFVGEKFRNYLLV    162
297-314   YAFVGEKFRNYLLVFFQK    596
301-318   GEKFRNYLLVFFQKHIAK     69



38 



65 



76 



23 



64 
"53 



100 



84 



84 
"47 



305-322   RNYLLVFFQKHIAKRFCK
309-326   LVFFQKHIAKRFCKCCSI
313-330   QKHIAKRFCKCCSIFQQE
317-334   AKRFCKCCSIFQQEAPER
321-338   CKCCSIFQQEAPERASSV
325-342   SIFQQEAPERASSVYTRS
329-346   QEAPERASSVYTRSTGEQ
333-350   ERASSVYTRSTGEQEISV
337-352   SVYTRSTGEQEISVGL

These data indicate that, in addition to polypeptide
sequences derived from positions 1-18 of the CCR5
receptor, the polypeptide sequences
QWDFGNTMCQLLTGLYFIGFFS, SQYQFWKNFQTLKIVILG,
APYNIVLLLNTFQEFFGLNNCS, and YAFVGEKFRNYLLVFFQK comprise
multiple subsequences, each of which is capable of binding
to HIV-1 envelope gp120.
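
The way in which overlapping 18-residue windows combine into a
longer binding region can be sketched as follows. This is a
minimal illustration and not the procedure used to generate the
data above; the function name and the choice of which windows to
treat as binders are assumptions made for the sketch.

```python
def merge_windows(windows):
    """Merge overlapping (start, peptide) windows into contiguous regions.

    `start` is the 1-based position of the first residue of the peptide.
    Returns a list of (start, end, sequence) tuples.
    """
    merged = []
    for start, peptide in sorted(windows):
        end = start + len(peptide) - 1
        if merged and start <= merged[-1][1]:
            prev_start, prev_end, prev_seq = merged[-1]
            shift = start - prev_start
            overlap = min(prev_end, end) - start + 1
            # Overlapping residues must agree, or the windows do not tile.
            if prev_seq[shift:shift + overlap] != peptide[:overlap]:
                raise ValueError("overlapping windows disagree in sequence")
            if end > prev_end:
                merged[-1] = (prev_start, end, prev_seq + peptide[overlap:])
        else:
            merged.append((start, end, peptide))
    return merged


# Windows assumed, for illustration, to have been scored as binders.
binders = [
    (249, "APYNIVLLLNTFQEFFGL"),
    (253, "IVLLLNTFQEFFGLNNCS"),
    (297, "YAFVGEKFRNYLLVFFQK"),
]
regions = merge_windows(binders)
for start, end, sequence in regions:
    print(f"{start}-{end}\t{sequence}")
```

Merging the windows beginning at positions 249 and 253 yields
the 22-residue sequence APYNIVLLLNTFQEFFGLNNCS recited above,
whereas an isolated window such as the one at position 297 is
returned unchanged.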






Example 2

This example identifies segments of the CXCR4
co-receptor that bind with gp120.

The first column in the table below indicates the
number of the amino acid in the wild-type CXCR4 receptor.
The second column explicitly identifies the peptide
sequence. The third and fourth columns indicate the
radioactive counts recorded in twenty minutes (i.e., the
cpm x 20) after the background or non-specific counts had
been subtracted. The fifth column contains an X in each
row for which the listed polypeptide bound with high
affinity to gp120. The sixth and final column contains
an X in each row wherein the listed sequence binds with
substantial affinity but is weak in comparison to other
samples, particularly adjacent samples.
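
The background subtraction can be sketched as follows. Reading
the third column of the table that follows as raw counts and the
fourth column as the raw counts less the count of the empty
control well is an inference from the tabulated values, and the
function name is an assumption made for the sketch.

```python
def subtract_background(raw_counts, control_label="empty"):
    """Subtract the empty-control count from every well's raw count."""
    background = raw_counts[control_label]
    return {label: count - background for label, count in raw_counts.items()}


# Raw counts per 20 minutes for the first rows of the CXCR4 table.
raw = {
    "empty": 412,
    "1-18": 3003,
    "5-22": 483,
    "9-26": 455,
    "13-30": 453,
    "17-34": 384,
}
net = subtract_background(raw)
for label, count in net.items():
    print(f"{label}\t{count}")
```

A negative net value, as for segment 17-34, simply indicates a
well whose count fell below that of the empty control.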
SEQ SEG   PEPTIDE              Counts   Counts - background   Major activity peak   Minor activity peak

empty (control)                   412      0
1-18      MEGISIYTSDNYTEEMGS     3003   2591
5-22      SIYTSDNYTEEMGSGDYD      483     71
9-26      SDNYTEEMGSGDYDSMKE      455     43
13-30     TEEMGSGDYDSMKEPCFR      453     41
17-34     GSGDYDSMKEPCFREENA      384    -28
21-38     YDSMKEPCFREENANFNK      465     53
25-42     KEPCFREENANFNKIFLP      664    252
29-46     FREENANFNKIFLPTIYS      463     51
33-50     NANFNKIFLPTIYSIIFL      585    173
37-54     NKIFLPTIYSIIFLTGIV      550    138
41-58     LPTIYSIIFLTGIVGNGL      530    118
45-62     YSIIFLTGIVGNGLVILV      535    123
49-66     FLTGIVGNGLVILVMGYQ      658    246
53-70     IVGNGLVILVMGYQKKLR      650    238
57-74     GLVILVMGYQKKLRSMTD      569    157
61-78     LVMGYQKKLRSMTDKYRL      517    105
65-82     YQKKLRSMTDKYRLHLSV      511     99
69-86     LRSMTDKYRLHLSVADLL      572    160
73-90     TDKYRLHLSVADLLFVIT      504     92
77-94     RLHLSVADLLFVITLPFW      548    136
81-98     SVADLLFVITLPFWAVDA      665    253
85-102    LLFVITLPFWAVDAVANW      475     63
89-106    ITLPFWAVDAVANWYFGN      542    130
93-110    FWAVDAVANWYFGNFLCK      478     66
97-114    DAVANWYFGNFLCKAVHV      524    112
101-118   NWYFGNFLCKAVHVIYTV      508     96
105-122   GNFLCKAVHVIYTVNLYS      643    231
109-126   CKAVHVIYTVNLYSSVLI      655    243
113-130   HVIYTVNLYSSVLILAFI      530    118
117-134   TVNLYSSVLILAFISLDR      654    242
121-138   YSSVLILAFISLDRYLAI      569    157
125-142   LILAFISLDRYLAIVHAT      519    107
129-146   FISLDRYLAIVHATNSQR      503     91
133-150   DRYLAIVHATNSQRPRKL      580    168
137-154   AIVHATNSQRPRKLLAEK      485     73
141-158   ATNSQRPRKLLAEKVVYV      490     78
145-162   QRPRKLLAEKVVYVGVWI      539    127



X 






149-166          KLLAEKVVYVGVWIPALL       501      89
153-170          EKVVYVGVWIPALLLTIP       559     147
157-174          YVGVWIPALLLTIPDFIF       536     124
161-178          WIPALLLTIPDFIFANVS       594     182
165-182          LLLTIPDFIFANVSEADD      1418    1006
169-186          IPDFIFANVSEADDRYIC       850     438
173-190          IFANVSEADDRYICDRFY       679     267
177-194          VSEADDRYICDRFYPNDL       569     157
181-198          DDRYICDRFYPNDLWVVV       537     125
185-202          ICDRFYPNDLWVVVFQFQ       718     306
189-206          FYPNDLWVVVFQFQHIMV       828     416
193-210          DLWVVVFQFQHIMVGLIL       834     422
197-214          VVFQFQHIMVGLILPGIV      1001     589
201-218          FQHIMVGLILPGIVILSC       582     170
205-222          MVGLILPGIVILSCYCII       579     167
209-226          ILPGIVILSCYCIIISKL       604     192
213-230          IVILSCYCIIISKLSHSK       689     277
217-234          SCYCIIISKLSHSKGHQK       671     259
221-238          IIISKLSHSKGHQKRKAL       569     157
225-242          KLSHSKGHQKRKALKTTV       542     130
229-246          SKGHQKRKALKTTVILIL       552     140
233-250          QKRKALKTTVILILAFFA       695     283
237-254          ALKTTVILILAFFACWLP       673     261
241-258          TVILILAFFACWLPYYIG       735     323
245-262          ILAFFACWLPYYIGISID       596     184
249-266          FACWLPYYIGISIDSFIL       614     202
253-270          LPYYIGISIDSFILLEII       851     439
257-274          IGISIDSFILLEIIKQGC      1146     734
261-278          IDSFILLEIIKQGCEFEN      3884    3472
265-282          ILLEIIKQGCEFENTVHK       529     117
269-286          IIKQGCEFENTVHKWISI       518     106
273-290          GCEFENTVHKWISITEAL       676     264
277-294          ENTVHKWISITEALAFFH       727     315
281-298          HKWISITEALAFFHCCLN       575     163
285-302          SITEALAFFHCCLNPILY       600     188
289-306          ALAFFHCCLNPILYAFLG       593     181
293-310          FHCCLNPILYAFLGAKFK       535     123
297-314          LNPILYAFLGAKFKTSAQ       686     274
301-318          LYAFLGAKFKTSAQHALT       568     156
305-322          LGAKFKTSAQHALTSVSR       612     200
309-326          FKTSAQHALTSVSRGSSL       585     173
313-330          AQHALTSVSRGSSLKILS       559     147



X 
317-334          LTSVSRGSSLKILSKGKR       595     183
321-338          SRGSSLKILSKGKRGGHS       581     169
325-342          SLKILSKGKRGGHSSVST       697     285
329-346          LSKGKRGGHSSVSTESES       597     185
333-350          KRGGHSSVSTESESSSFH       579     167
337-352          HSSVSTESESSSFHSS         515     103



These data indicate that, in addition to polypeptide
sequences derived from positions 1-18 of the CXCR4
receptor, the polypeptide sequences LLLTIPDFIFANVSEADD
(165-182), VVFQFQHIMVGLILPGIV (197-214), and
IDSFILLEIIKQGCEFEN (261-278) comprise multiple
subsequences, each of which is capable of binding to HIV-1
envelope gp120.
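The segments tabulated in this example are overlapping 18-residue windows whose first residues are four positions apart (1-18, 5-22, 9-26, and so on). A minimal sketch of how such a tiling can be generated follows. The function name is hypothetical, the input is only the first 38 residues of CXCR4 as they appear in the table, and the sketch does not reproduce the shorter final segment (337-352) that closes the full-length series.

```python
# Illustrative sketch; `tile_peptides` is a hypothetical helper, not part of the disclosure.
def tile_peptides(sequence, length=18, step=4):
    """Return (first_residue, last_residue, peptide) tuples using 1-based numbering."""
    windows = []
    for start in range(0, len(sequence) - length + 1, step):
        windows.append((start + 1, start + length, sequence[start:start + length]))
    return windows


# Residues 1-38 of CXCR4, assembled from the segments listed in the table.
CXCR4_1_38 = "MEGISIYTSDNYTEEMGSGDYDSMKEPCFREENANFNK"

windows = tile_peptides(CXCR4_1_38)
for first, last, peptide in windows:
    print(f"{first}-{last}\t{peptide}")
```

With an offset of four residues, any stretch of up to 15 consecutive residues lies wholly within at least one 18-mer, so a short binding determinant is always presented intact on at least one peptide.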



Example 3

This example identifies segments of the STRL33
co-receptor that bind with gp120.

The first column in the table below indicates the
number of the amino acid in the wild-type STRL33
receptor. The second column explicitly identifies the
peptide sequence. The third and fourth columns indicate
the radioactive counts recorded in twenty minutes (i.e.,
the cpm x 20) after the background or non-specific counts
had been subtracted. The fifth column contains an X in
each row for which the listed polypeptide bound with high
affinity to gp120. The sixth and final column contains
an X in each row wherein the listed sequence binds with
substantial affinity but is weak in comparison to other
samples, particularly adjacent samples.
SEQ SEG          PEPTIDE                                 Major      Minor
                                                         Activity   Activity
                                                         Peak       Peak

empty (control)
1-18             MAEHDYHEDYGFSSFNDS
5-22             DYHEDYGFSSFNDSSQEE
9-26             DYGFSSFNDSSQEEHQAF
13-30            SSFNDSSQEEHQAFLQFS
17-34            DSSQEEHQAFLQFSKVFL
21-38            EEHQAFLQFSKVFLPCMY
25-42            AFLQFSKVFLPCMYLVVF
29-46            FSKVFLPCMYLVVFVCGL
33-50            FLPCMYLVVFVCGLVGNS
37-54            MYLVVFVCGLVGNSLVLV
41-58            VFVCGLVGNSLVLVISIF
45-62            GLVGNSLVLVISIFYHKL
49-66            NSLVLVISIFYHKLQSLT
53-70            LVISIFYHKLQSLTDVFL
57-74            IFYHKLQSLTDVFLVNLP
61-78            KLQSLTDVFLVNLPLADL
65-82            LTDVFLVNLPLADLVFVC
69-86            FLVNLPLADLVFVCTLPF
73-90            LPLADLVFVCTLPFWAYA
77-94            DLVFVCTLPFWAYAGIHE
81-98            VCTLPFWAYAGIHEWVFG
85-102           PFWAYAGIHEWVFGQVMC
89-106           YAGIHEWVFGQVMCKSLL
93-110           HEWVFGQVMCKSLLGIYT
97-114           FGQVMCKSLLGIYTINFY
101-118          MCKSLLGIYTINFYTSML
105-122          LLGIYTINFYTSMLILTC
109-126          YTINFYTSMLILTCITVD
113-130          FYTSMLILTCITVDRFIV
117-134          MLILTCITVDRFIVVVKA
121-138          TCITVDRFIVVVKATKAY
125-142          VDRFIVVVKATKAYNQQA
129-146          IVVVKATKAYNQQAKRMT
133-150          KATKAYNQQAKRMTWGKV
137-154          AYNQQAKRMTWGKVTSLL
141-158          QAKRMTWGKVTSLLIWVI
145-162          MTWGKVTSLLIWVISLLV



-34,5 


34.5 


1178 5 

lit W.w 


1320.5 






QC7Q C 

oo/y.o 




2689.0 


4.(01 .0 


869.5 


Z1OZ.0 


231 6.5 


1819.5 


1421.5 


1359.5 


534.5 


633.5 


605.5 


372.5 


168.5 


235.5 


570.5 


284.5 


164.5 


95.5 


1255.5 


1378.5 


1620.5 


1780.5 


1275.5 


1256.5 


412.5 


348.5 


233.5 


336.5 


70.5 


51.5 


557.5 


960.5 


1116.5 


1 063.5 


1819.5 


A ~7C A e 

1754.5 


7262.5 


7C17 e 

Too ( .0 


591 1 .5 




ooyi .o 




1 £0 1 .5J 






1283 5 


499.5 


408.5 


351.5 


510.5 


744.5 


907.5 


298.5 


228.5 


89.5 


> 346.5 


103.5 


i 53.5 


166.5 


> 43.5 


701 .£ 


> 568.5 


55.f 


> 4.5 


-71 .£ 


5 -31.5 


-0.! 


5 -26.5 



X 
X 

X 
X 
X 
X 



X 
X 



X 
X 
X 

X 
X 
X 
149-166          KVTSLLIWVISLLVSLPQ      -39.5   -118.5
153-170          LLIWVISLLVSLPQIIYG       42.5     75.5
157-174          VISLLVSLPQIIYGNVFN      -60.5   -127.5
161-178          LVSLPQIIYGNVFNLDKL       91.5    -15.5
165-182          PQIIYGNVFNLDKLICGY      -18.5    -37.5
169-186          YGNVFNLDKLICGYHDEA      -41.5    -20.5
173-190          FNLDKLICGYHDEAISTV     1072.5   1078.5
177-194          KLICGYHDEAISTVVLAT     1363.5   1604.5
181-198          GYHDEAISTVVLATQMTL      754.5   1181.5
185-202          EAISTVVLATQMTLGFFL     3973.5   3745.5
189-206          TVVLATQMTLGFFLPLLT     2327.5   2389.5
193-210          ATQMTLGFFLPLLTMIVC     2365.5   2444.5
197-214          TLGFFLPLLTMIVCYSVI     2387.5    479.5
201-218          FLPLLTMIVCYSVIIKTL     1270.5   1195.5
205-222          LTMIVCYSVIIKTLLHAG     2787.5   2654.5
209-226          VCYSVIIKTLLHAGGFQK     1334.5   1143.5
213-230          [illegible]             961.5    682.5
217-234          [illegible]            1041.5    999.5
221-238          [illegible]             340.5    260.5
225-242          [illegible]             810.5    814.5
229-246          [illegible]             612.5    853.5
233-250          [illegible]             386.5    772.5
237-254          MAVFLLTQMPFNLMKFIR     2263.5   2842.5
241-258          LLTQMPFNLMKFIRSTHW     2513.5   3154.5
245-262          MPFNLMKFIRSTHWEYYA     2171.5   2182.5
249-266          LMKFIRSTHWEYYAMTSF      934.5    949.5
253-270          IRSTHWEYYAMTSFHYTI     1571.5   1807.5
257-274          HWEYYAMTSFHYTIMVTE     2040.5   3065.5
261-278          YAMTSFHYTIMVTEAIAY     2688.5   2359.5
265-282          SFHYTIMVTEAIAYLRAC      761.5   1033.5
269-286          TIMVTEAIAYLRACLNPV      140.5    272.5
273-290          TEAIAYLRACLNPVLYAF      604.5    480.5
277-294          AYLRACLNPVLYAFVSLK     1802.5   1849.5
281-298          ACLNPVLYAFVSLKFRKN     4173.5   4515.5
285-302          PVLYAFVSLKFRKNFWKL     1859.5   2147.5
289-306          AFVSLKFRKNFWKLVKDI      808.5   1040.5
293-310          LKFRKNFWKLVKDIGCLP      920.5    957.5
297-314          KNFWKLVKDIGCLPYLGV      143.5     82.5
301-318          KLVKDIGCLPYLGVSHQW       -2.5     27.5
305-322          DIGCLPYLGVSHQWKSSE       17.5     78.5
309-326          LPYLGVSHQWKSSEDNSK      111.5    122.5
313-330          GVSHQWKSSEDNSKTFSA      208.5    306.5
317-334          QWKSSEDNSKTFSASHNV      464.5    533.5
321-338          SEDNSKTFSASHNVEATS      524.5    434.5
325-342          SKTFSASHNVEATSMFQL     1524.5   1239.5



These data indicate that, in addition to polypeptide
sequences derived from positions 9-26 of the STRL33
receptor, the polypeptide sequences LVISIFYHKLQSLTDVFL
(53-70), PFWAYAGIHEWVFGQVMC (85-102), EAISTVVLATQMTLGFFL
(185-202), LTMIVCYSVIIKTLLHAG (205-222),
MAVFLLTQMPFNLMKFIRSTHW (237-258), HWEYYAMTSFHYTIMVTE
(257-274), ACLNPVLYAFVSLKFRKN (281-298) and
SKTFSASHNVEATSMFQL (325-342) comprise multiple
subsequences, each of which is capable of binding to HIV-1
envelope gp120.

Example 4 

This example identifies segments of the human CD4
protein that bind with gp120.

The second column in the table below identifies the
amino acid residue sequence of the polypeptide employed
in the assay. The first column identifies the sequence
coordinates of human CD4 that have an identical amino
acid sequence. The third column indicates the number of
radioactive decays (i.e., counts) that were counted,
which is indicative of the affinity of the synthetic
polypeptide for the gp120 protein. In the table below,
polypeptides retaining more than 4,000 counts identify
fragments that have a substantial capability to bind with
gp120. Polypeptides retaining more than 6,000 counts
have more substantial binding affinity. Polypeptides
retaining at least about 10,000 counts have a substantial
and strong capacity to bind to gp120. Of course,
fragments corresponding to amino acid coordinates 101-121
and 106-126 have a substantial, strong, and dominant
capacity to bind to gp120.
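The three count thresholds stated above lend themselves to a simple classification rule. The sketch below is illustrative only: the function name and the category labels are hypothetical shorthand for the wording of this example, and "at least about 10,000" is treated as a hard cutoff of 10,000 for the sake of the illustration.

```python
# Illustrative sketch; labels paraphrase the thresholds stated in this example.
def classify_binding(counts):
    """Map retained counts onto the binding categories described in the text."""
    if counts >= 10000:          # "at least about 10,000 counts"
        return "substantial and strong"
    if counts > 6000:            # "more than 6,000 counts"
        return "more substantial"
    if counts > 4000:            # "more than 4,000 counts"
        return "substantial"
    return "below threshold"


# A few count values of the kind reported in the CD4 table.
for counts in (44965, 6114, 4030, 2075):
    print(counts, classify_binding(counts))
```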



B1  (1)     1-21      MNRGVPFRHLLLVLQLALLPA     3587
C1  (2)     6-26      PFRHLLLVLQLALLPAATQGK     4356
D1  (3)     11-31     LLVLQLALLPAATQGKKVVLG     1785


rr% -i 

EjI 


I 4; 


lb - 


■JO 


T,AL.T,PAATOGKKWLGKKGDT 


1 # WW 


Fl 


( 5) 


21- 


A 1 
-41 


a ATnnK'KVVTjGKKGDTVELTC 


. 1 ww^ 


Gl 


( 6) 


26- 


-46 


t\ tVv V JJwX\JE\VJL/ X V J— i JLJ X x ruJ^iv 


la IU 


HI 


( 7) 


31- 


-51 




I OO I 


A2 


( 8) 


36- 


-56 


. TT7T7T r r , r ,r P2VQOTrTrQTnT* , TTWTTWc; 
X V XtiXj 1 v*. X nO .1 W * *x V» l\XM o 


I / OxC 


B2 


( 9) 


41- 


-61 


w IaOU^-^^^x" £X W XNXV OXN S< X XV. X 


1717 


C2 


(10) 


46 - 


-66 


JSJ\,Oiyx* xl W XvW OiN 1 Xv. 1 XjorlM vj 


Z I OxC 


D2 


(11) 


51- 


-71 


r xi W i\JN ol\ ^ 1 JN. 1 xjvjIn yuo r Jjijx 




E2 


(12) 


56- 


-76 


OTVT^V T1/TT i^lTVT/*^/^ o "PT VTl'O G VT . 

S NQ 1 is. 1 JjvjJN y O r xj 1 J\\3 ir O IVJU 


1^G7 


F2 


(13) 


61- 


-81 


tt /"i"\'Tr~\/~i G "CT TVPDGTTT.'NTnD AH 
iLiGWUvjoI* J-i 1 xv^xrOiSJjlNlJrCf^lJ 


1 Aft7 


G2 


(14) 


66- 


-86 


Cjox* Jul JX^JrOJXxxWxJiXirlx^oJXJXox^ 


1 Axlxl 


H2 


(15) 


71- 


-91 


JXVJXT O XvXJlN X> J\JnXJO jcvi^o XJ V» XV^VJXM 


1Q19 

1 57 1 xC 


A3 


(16) 


76- 


-96 


| ilMUKAJJ OXCKOlJn JJy VJlN x xrXJl 1 


17^^ 
I / wO 


B3 


(17) 


81- 


-101 


xJoiXiXOxj»VxvyvjrJ.Nx JtrXJ x x jvlnj xjxvjl 


999x1 

xlx.x.'r 


C3 


(18) 


86- 


-106 


LlyilvywlNr XT XJ X X XVJ.N XJXVX IjUkjU X 


WX.W*T 


D3 


(19) 


91- 


-111 


WT7DT.T TTnNJT.'R p TF.DSDTYTC!EV 
£N J7 irxjl X XXJLN XiXN. x xaxvij xv x x x v^xj v 


1 1646 


E3 


(20) 


96 • 


-lib 


T TTNT .TT TTSlDflDTYI CEVEDOKE 
X XviAl J— ixvx ci i /wxy xxx i_ i v i 1 1 /yiu-i 


8439 

W"TW W 


F3  (21)    101-121   IEDSDTYICEVEDQKEEVQLL     6803
G3  (22)    106-126   TYICEVEDQKEEVQLLVFGLT    44965
H3  (23)    111-131   VEDQKEEVQLLVFGLTANSDT    36249
A4  (24)    116-136   EEVQLLVFGLTANSDTHLLQG    14171
B4  (25)    121-141   LVFGLTANSDTHLLQGQSLTL     3683
C4  (26)    126-146   TANSDTHLLQGQSLTLTLESP     6114
D4  (27)    131-151   THLLQGQSLTLTLESPPGSSP     2552
E4  (28)    136-156   GQSLTLTLESPPGSSPSVQCR     1538
F4  (29)    141-161   LTLESPPGSSPSVQCRSPRGK     1476
G4  (30)    146-166   PPGSSPSVQCRSPRGKNIQGG     1496
H4  (31)    151-171   PSVQCRSPRGKNIQGGKTLSV     1400
A5  (32)    156-176   RSPRGKNIQGGKTLSVSQLEL     2066
B5  (33)    161-181   KNIQGGKTLSVSQLELQDSGT     3078
C5  (34)    166-186   GKTLSVSQLELQDSGTWTCTV     2618
D5  (35)    171-191   VSQLELQDSGTWTCTVLQNQK     3879
E5  (36)    176-196   LQDSGTWTCTVLQNQKKVEFK     2456
F5  (37)    181-201   TWTCTVLQNQKKVEFKIDIVV     4030
G5  (38)    186-206   VLQNQKKVEFKIDIVVLAFQK     9737
H5  (39)    191-211   KKVEFKIDIVVLAFQKASSIV     6313
A6  (40)    196-216   KIDIVVLAFQKASSIVYKKEG     3681
B6  (41)    201-221   VLAFQKASSIVYKKEGEQVEF     3566
C6  (42)    206-226   KASSIVYKKEGEQVEFSFPLA    14347
D6  (43)    211-231   VYKKEGEQVEFSFPLAFTVEK    14740
E6  (44)    216-236   GEQVEFSFPLAFTVEKLTGSG    18549
F6  (45)    221-241   FSFPLAFTVEKLTGSGELWWQ     9673
G6  (46)    226-246   AFTVEKLTGSGELWWQAERAS     3992
H6  (47)    231-251   KLTGSGELWWQAERASSSKSW     1878
A7  (48)    236-256   GELWWQAERASSSKSWITFDL     2730
B7  (49)    241-261   QAERASSSKSWITFDLKNKEV     2588
C7  (50)    246-266   SSSKSWITFDLKNKEVSVKRV     1761
D7  (51)    251-271   WITFDLKNKEVSVKRVTQDPK     2126
E7  (52)    256-276   LKNKEVSVKRVTQDPKLQMGK     2288
F7  (53)    261-281   VSVKRVTQDPKLQMGKKLPLH     1848
G7  (54)    266-286   VTQDPKLQMGKKLPLHLTLPQ     2075
H7  (55)    271-291   KLQMGKKLPLHLTLPQALPQY     1949
A8  (56)    276-296   KKLPLHLTLPQALPQYAGSGN     1922
B8  (57)    281-301   HLTLPQALPQYAGSGNLTLAL     2394
C8  (58)    286-306   QALPQYAGSGNLTLALEAKTG     2364
D8  (59)    291-311   YAGSGNLTLALEAKTGKLHQE     1830
E8  (60)    296-316   NLTLALEAKTGKLHQEVNLVV     1676
F8  (61)    301-321   LEAKTGKLHQEVNLVVMRATQ     1729
G8  (62)    306-326   GKLHQEVNLVVMRATQLQKNL     1776
H8  (63)    311-331   EVNLVVMRATQLQKNLTCEVW     2183
A9  (64)    316-336   VMRATQLQKNLTCEVWGPTSP     2144
B9  (65)    321-341   QLQKNLTCEVWGPTSPKLMLS     1856
C9  (66)    326-346   LTCEVWGPTSPKLMLSLKLEN     2412
D9  (67)    331-351   WGPTSPKLMLSLKLENKEAKV     2414
E9  (68)    336-356   PKLMLSLKLENKEAKVSKREK     1656
F9  (69)    341-361   SLKLENKEAKVSKREKAVWVL     1663
G9  (70)    346-366   NKEAKVSKREKAVWVLNPEAG     1735
H9  (71)    351-371   VSKREKAVWVLNPEAGMWQCL     2034
A10 (72)    356-376   KAVWVLNPEAGMWQCLLSDSG     3133
B10 (73)    361-381   LNPEAGMWQCLLSDSGQVLLE     6316
C10 (74)    366-386   GMWQCLLSDSGQVLLESNIKV     4185
D10 (75)    371-391   LLSDSGQVLLESNIKVLPTWS     2375
E10 (76)    376-396   GQVLLESNIKVLPTWSTPVQP     2089
F10 (77)    381-401   ESNIKVLPTWSTPVQPMALIV     1992
G10 (78)    386-406   VLPTWSTPVQPMALIVLGGVA     2197
H10 (79)    391-411   STPVQPMALIVLGGVAGLLLF     2527
A11 (80)    396-416   PMALIVLGGVAGLLLFIGLGI     3067
B11 (81)    401-421   VLGGVAGLLLFIGLGIFFCVR     3738
C11 (82)    406-426   AGLLLFIGLGIFFCVRCRHRR     2099
D11 (83)    411-431   FIGLGIFFCVRCRHRRRQAER     1900
E11 (84)    416-436   IFFCVRCRHRRRQAERMSQIK     2085
F11 (85)    421-441   RCRHRRRQAERMSQIKRLLSE     2075
G11 (86)    426-446   RRQAERMSQIKRLLSEKKTCQ     1607
H11 (87)    431-451   RMSQIKRLLSEKKTCQCPHRF     2020
A12 (88)    436-456   KRLLSEKKTCQCPHRFQKTCS     1674
B12 (89)    441-458   EKKTCQCPHRFQKTCSPI        2006
A1  (0)     empty (control)                     2075



Example 5 

This example shows the binding of ^I-HIV-1^ gp120
to the amino termini of CCR5, CXCR4, and STRL33 as a
function of the dependence on position and length.
Synthetic peptide arrays of nonapeptides, dodecapeptides,
pentadecapeptides and octadecapeptides derived from CCR5
(panel A), CXCR4 (panel B) and STRL33 (panel C) amino
terminal domains were prepared and utilized to test the
binding of ^I-HIV-1^ envelope gp120. Ordinal sequence
position numbers are given in accordance with the
sequence data provided by the Genbank database for CCR5
(accession No. g1457946, gi|1457946), CXCR4 (accession
No. g539677, gi|400654, sp|P30991) and STRL33 (accession
No. g2209288, gi|2209288). The counts shown are the
counts detected in each well minus the background counts
(i.e., counts observed in the assay when no polypeptide
was bound to the well of the 96-well assay plate).
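In the panels of this example, the 9-, 12-, 15- and 18-mers in a given row share the same starting residue, and longer windows are dropped once they would run past the end of the amino terminal domain. A minimal sketch of that windowing follows. The helper name is hypothetical, and the 31-residue CCR5 domain string is assembled from the Panel A peptides rather than taken from the Genbank record.

```python
# Illustrative sketch; `nested_windows` is a hypothetical helper.
def nested_windows(domain, start, lengths=(9, 12, 15, 18)):
    """Return {length: peptide} for windows that begin at 1-based position `start`.

    Window lengths that would extend beyond the end of `domain` are omitted.
    """
    result = {}
    for length in lengths:
        peptide = domain[start - 1:start - 1 + length]
        if len(peptide) == length:
            result[length] = peptide
    return result


# CCR5 residues 1-31, assembled from the peptides listed in Panel A.
CCR5_NTERM = "MDYQVSSPIYDINYYTSEPCQKINVKQIAAR"

for start in (1, 15, 21):
    print(start, nested_windows(CCR5_NTERM, start))
```

Starting position 15 yields only 9-, 12- and 15-mers, and position 21 yields only a 9-mer, which matches the shortening of the listed peptides toward the end of Panel A.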
Panel A   CCR5

Peptide Sequence Scanning Windows
(In each sequence row 9-, 12-, 15-, 18-mers share the
same initial starting point)
    xxxxxxxxx            9
    xxxxxxxxxxxx         12
    xxxxxxxxxxxxxxx      15
    xxxxxxxxxxxxxxxxxx   18

Binding Results For Window Length
(counts bound - background (no peptide))

Initial
Sequence #   Peptide                  9      12      15      18
 1           MDYQVSSPIYDINYYTSE     543    2682    4976    5880
 2           DYQVSSPIYDINYYTSEP    1552    3089    5401    6363
 3           YQVSSPIYDINYYTSEPC    2533    5305    5415    6119
 4           QVSSPIYDINYYTSEPCQ     490    1959    4594    5645
 5           VSSPIYDINYYTSEPCQK     509    1629    3280    3521
 6           SSPIYDINYYTSEPCQKI     671    1739    3498    3285
 7           SPIYDINYYTSEPCQKIN    1503    3463    4575    3234
 8           PIYDINYYTSEPCQKINV    1186    2285    2682    2036
 9           IYDINYYTSEPCQKINVK    1359    2702    2516    1261
10           YDINYYTSEPCQKINVKQ    4379    5245    3052    1913
11           DINYYTSEPCQKINVKQI    1396    1361    1144     712
12           INYYTSEPCQKINVKQIA    1384    1190     707     684
13           NYYTSEPCQKINVKQIAA    1548     977     760     595
14           YYTSEPCQKINVKQIAAR    1029    1052     847     638
15           YTSEPCQKINVKQIA        567     507     459
16           TSEPCQKINVKQIAA        440     427     509
17           SEPCQKINVKQIAAR        434     430     426
18           EPCQKINVKQIA           397     432
19           PCQKINVKQIAA           386     385
20           CQKINVKQIAAR           435     581
21           QKINVKQIA              453
22           KINVKQIAA              487
23           INVKQIAAR              474
Panel B   CXCR4

Peptide Sequence Scanning Windows
(In each sequence row 9-, 12-, 15-, 18-mers share the
same initial starting point)
    xxxxxxxxx            9
    xxxxxxxxxxxx         12
    xxxxxxxxxxxxxxx      15
    xxxxxxxxxxxxxxxxxx   18

Binding Results For Window Length
(counts bound - background)

Initial Sequence #



12 



— 


1 
1 




0 






f*1 




P 




00 

Jz- 






7 






Cj 


Q 




10 




ll 


ru 
sj 


12 


13 


0 


14 


w 


15 




16 




17 




18 




19 




20 




21 




22 




23 




24 




25 




26 




27 




28 




29 




"a Not done 



MEGISIYTSDNYTEEMGS
EGISIYTSDNYTEEMGSG
GISIYTSDNYTEEMGSGD
ISIYTSDNYTEEMGSGDY
SIYTSDNYTEEMGSGDYD
IYTSDNYTEEMGSGDYDS
YTSDNYTEEMGSGDYDSM
TSDNYTEEMGSGDYDSMK
SDNYTEEMGSGDYDSMKE
DNYTEEMGSGDYDSMKEP
NYTEEMGSGDYDSMKEPC
YTEEMGSGDYDSMKEPCF
TEEMGSGDYDSMKEPCFR
EEMGSGDYDSMKEPCFRE
EMGSGDYDSMKEPCFREE
MGSGDYDSMKEPCFREEN
GSGDYDSMKEPCFREENA
SGDYDSMKEPCFREENAN
GDYDSMKEPCFREENANF
DYDSMKEPCFREENANFN
YDSMKEPCFREENANFNK
DSMKEPCFREENANFNKI
SMKEPCFREENANFN
MKEPCFREENANFNK
KEPCFREENANFNKI
EPCFREENANFN
PCFREENANFNK
CFREENANFNKI
REENANFNK



15 



18 



591 


334 


3275 


2079 


a 


886 


7255 


1548 


454 


2644 


3274 


1217 


466 


3973 


2202 


861 


a 


288 


168 


239 


332 


335 


195 


173 


181 


161 


201 


103 


a 


54 


119 


38 


151 


149 


124 


161 


67 


121 


57 




a 


100 


30 


134 


68 


213 


70 


103 


146 


67 


23 


47 


a 


61 


121 


130 


64 


36 


69 


64 


57 


68 


64 


129 


a 


155 


172 


155 


100 


118 


186 


89 


53 


167 


i98 


134 


a 


167 


146 


75 


171 


144 


80 


89 


85 


144 


146 


40 


a 


119 


55 




188 


133 


74 




165 


105 


93 




a 


69 






104 


108 






103 


66 






58 
Panel C   STRL33

Peptide Sequence Scanning Windows
(In each sequence row 9-, 12-, 15-, 18-mers share the
same initial starting point)
    xxxxxxxxx            9
    xxxxxxxxxxxx         12
    xxxxxxxxxxxxxxx      15
    xxxxxxxxxxxxxxxxxx   18

Binding Results For Window Length
(counts bound - background)

Initial
Sequence #   Peptide                  9      12      15      18
 1           MAEHDYHEDYGFSSFNDS     160     625    1239    1386
 2           AEHDYHEDYGFSSFNDSS     354     697    1095    1014
 3           EHDYHEDYGFSSFNDSSQ     509     937    2235    1219
 4           HDYHEDYGFSSFNDSSQE     708    1427    1772    1500
 5           DYHEDYGFSSFNDSSQEE     851    1554    1240    1191
 6           YHEDYGFSSFNDSSQEEH     728    1950    1357     985
 7           HEDYGFSSFNDSSQEEHQ     729    1077     947     537
 8           EDYGFSSFNDSSQEEHQA     953     817    1152     548
 9           DYGFSSFNDSSQEEHQAF     701     573     595     440
10           YGFSSFNDSSQEEHQAFL     345     745     645    1138
11           GFSSFNDSSQEEHQAFLQ     171     480     270    1639
12           FSSFNDSSQEEHQAFLQF     249     403     361    3608
13           SSFNDSSQEEHQAFLQFS     243     277     902    6038
14           SFNDSSQEEHQAFLQFSK     304     303     969    4537
15           FNDSSQEEHQAFLQFSKV     246     470    4089    4678
16           NDSSQEEHQAFLQFS        180     497    6160
17           DSSQEEHQAFLQFSK        147     882    4588
18           SSQEEHQAFLQFSKV        287    4455    4732
19           SQEEHQAFLQFS           647    7512
20           QEEHQAFLQFSK          1109    5672
21           EEHQAFLQFSKV          6060    5598
22           EHQAFLQFS             7505
23           HQAFLQFSK             2761
24           QAFLQFSKV             2600









Example 6 

This example shows ^I-HIV-1^ gp120 binding to
N-terminal peptide variants of CCR5, CXCR4 and STRL33.

Octadecapeptide alanine replacement variants of
maximum gp120 binding activity peaks were synthesized and
tested for ^I-HIV-1^ gp120 binding. Each binding value
presented is the average of two separate synthesis and
binding experiments. Relative percentage of Control =
{[(mean counts / Control counts)] x 100%} ± average
deviation. Background counts (no peptide, see Example 7)
were subtracted from all values. Data for CCR5 are
presented in Panel A; data for CXCR4 are presented in
Panel B; and data for STRL33 are presented in Panel C.
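The alanine replacement series and the relative-percentage calculation described above can be sketched as follows. Both function names are hypothetical, the replicate counts in the usage lines are invented purely to exercise the arithmetic, and expressing the average deviation on the same percentage scale as the mean is an assumption, since the text does not say how the deviation was scaled.

```python
# Illustrative sketch; helper names and the example counts are hypothetical.
def alanine_scan(peptide, first_position=1):
    """Return (label, variant) pairs with each residue in turn replaced by alanine."""
    variants = []
    for offset, residue in enumerate(peptide):
        label = f"{residue}{first_position + offset}A"
        variants.append((label, peptide[:offset] + "A" + peptide[offset + 1:]))
    return variants


def relative_percent(replicate_counts, control_counts):
    """Return (mean, average deviation), both as a percentage of the control counts."""
    mean = sum(replicate_counts) / len(replicate_counts)
    average_deviation = sum(abs(c - mean) for c in replicate_counts) / len(replicate_counts)
    return 100 * mean / control_counts, 100 * average_deviation / control_counts


# STRL33 residues 21-38, the peak octadecapeptide used for Panel C.
strl33_variants = alanine_scan("EEHQAFLQFSKVFLPCMY", first_position=21)
print(strl33_variants[0])                      # first variant of the series
print(relative_percent([90, 110], 200))        # made-up replicate counts
```

Note that a position that is already alanine (A25 of STRL33) yields a variant identical to the control peptide.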



Panel A. ^I-HIV-1^ gp120 binding to N-terminal peptide variants of CCR5

           CCR5 variant peptides (1-18)    Relative % of Control (a)

Control    MDYQVSSPIYDINYYTSE              100
M1A        ADYQVSSPIYDINYYTSE              167 ± 4
D2A        MAYQVSSPIYDINYYTSE              125 ± 8
Y3A        MDAQVSSPIYDINYYTSE               51 ± 2
Q4A        MDYAVSSPIYDINYYTSE              104 ± 7
V5A        MDYQASSPIYDINYYTSE               82 ± 3
S6A        MDYQVASPIYDINYYTSE              124 ± 3
S7A        MDYQVSAPIYDINYYTSE               56 ± 2
P8A        MDYQVSSAIYDINYYTSE              157 ± 2
I9A        MDYQVSSPAYDINYYTSE               24 ± 7
Y10A       MDYQVSSPIADINYYTSE               19 ± 6
D11A       MDYQVSSPIYAINYYTSE               63 ± 22
I12A       MDYQVSSPIYDANYYTSE               14 ± 1
N13A       MDYQVSSPIYDIAYYTSE              253 ± 19
Y14A       MDYQVSSPIYDINAYTSE               15 ± 0.3
Y15A       MDYQVSSPIYDINYATSE               21 ± 5
T16A       MDYQVSSPIYDINYYASE               78 ± 34
S17A       MDYQVSSPIYDINYYTAE               64 ± 6
E18A       MDYQVSSPIYDINYYTSA                4 ± 2

(a) The percent binding for the wild-type peptide was defined as 100%.
Panel B. ^I-HIV-1^ gp120 binding to N-terminal peptide variants of CXCR4

           CXCR4 variant peptides (1-18)   Relative % of Control (a)

Control    MEGISIYTSDNYTEEMGS              100
M1A        AEGISIYTSDNYTEEMGS              118 ± 18
E2A        MAGISIYTSDNYTEEMGS               36 ± 0.3
G3A        MEAISIYTSDNYTEEMGS              101 ± 3
I4A        MEGASIYTSDNYTEEMGS                6 ± 0.3
S5A        MEGIAIYTSDNYTEEMGS              133 ± 5
I6A        MEGISAYTSDNYTEEMGS                2 ± 1
Y7A        MEGISIATSDNYTEEMGS                7 ± 0.4
T8A        MEGISIYASDNYTEEMGS               97 ± 10
S9A        MEGISIYTADNYTEEMGS               70 ± 4
D10A       MEGISIYTSANYTEEMGS               71 ± 8
N11A       MEGISIYTSDAYTEEMGS               38 ± 0.4
Y12A       MEGISIYTSDNATEEMGS               28 ± 2
T13A       MEGISIYTSDNYAEEMGS               70 ± 6
E14A       MEGISIYTSDNYTAEMGS               72 ± 1
E15A       MEGISIYTSDNYTEAMGS               56 ± 7
M16A       MEGISIYTSDNYTEEAGS               88 ± 4
G17A       MEGISIYTSDNYTEEMAS               68 ± 8
S18A       MEGISIYTSDNYTEEMGA               79 ± 1

(a) The percent binding for the wild-type peptide was defined as 100%.
Panel C. ^I-HIV-1^ gp120 binding to N-terminal peptide variants of STRL33

           STRL33 variant peptides (21-38)   Relative % of Control (a)

Control    EEHQAFLQFSKVFLPCMY              100
E21A       AEHQAFLQFSKVFLPCMY               81 ± 2
E22A       EAHQAFLQFSKVFLPCMY               70 ± 1
H23A       EEAQAFLQFSKVFLPCMY               99 ± 1
Q24A       EEHAAFLQFSKVFLPCMY               72 ± 1
A25A       EEHQAFLQFSKVFLPCMY              101 ± 1
F26A       EEHQAALQFSKVFLPCMY               32 ± 0.1
L27A       EEHQAFAQFSKVFLPCMY               37 ± 2
Q28A       EEHQAFLAFSKVFLPCMY               44 ± 0.4
F29A       EEHQAFLQASKVFLPCMY               20 ± 1
S30A       EEHQAFLQFAKVFLPCMY               92 ± 2
K31A       EEHQAFLQFSAVFLPCMY              162 ± 2
V32A       EEHQAFLQFSKAFLPCMY               51 ± 3
F33A       EEHQAFLQFSKVALPCMY               45 ± 2
L34A       EEHQAFLQFSKVFAPCMY               76 ± 1
P35A       EEHQAFLQFSKVFLACMY               82 ± 3
C36A       EEHQAFLQFSKVFLPAMY               53 ± 5
M37A       EEHQAFLQFSKVFLPCAY              112 ± 4
Y38A       EEHQAFLQFSKVFLPCMA               83 ± 2

(a) The percent binding for the wild-type peptide was defined as 100%.



Example 7

[...] gp120 envelope protein to the polypeptides of the
present invention and to the chemokine receptors from
which the present inventive polypeptides were originally
derived or inspired is conserved across the various
species of HIV-1. This example also demonstrates that a
step subsequent to initial binding of gp120 to CCR5,
CXCR4, STRL33, and CD4 is the most likely source of the
phenomenon of host-range selectivity. Additionally, this
example demonstrates that the underlying method is
accurate in that receptor variants that are predicted to
have an altered affinity for binding with gp120 do in
fact have a statistically similar alteration in affinity
where comparable changes in the receptors have been
identified in other work and the affinity for binding of
gp120/effect on infectivity has been measured.

This example examines the effect of particular 
mutations of CCR5 that were studied in the work 
underlying the present invention and that were also 
studied by other artisans in the field. 

The following table identifies a mutation in the
first column. The first letter designates the wild-type
amino acid present at the position indicated by the
number, and the letter A which terminates all entries in
the first column indicates that the amino acid residue
present in that position in the mutant polypeptide is
alaninyl. For example, the first data row (i.e., the
second row of the table) contains the entry Y3A in the
first column, which indicates that the tyrosine residue
at position 3 of the wild-type CCR5 is substituted by an
alanine residue.
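The mutation notation just described (wild-type residue, position, replacement residue) can be read mechanically. The sketch below is an illustration under stated assumptions: the function name is hypothetical, positions are taken to be 1-based relative to the start of the supplied sequence, and the CCR5 residues 1-18 used in the usage lines are those of the wild-type octadecapeptide of Example 6.

```python
import re


# Illustrative sketch; `apply_substitution` is a hypothetical helper.
def apply_substitution(sequence, mutation):
    """Apply a point substitution written as <wild type><position><replacement>."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", mutation)
    if match is None:
        raise ValueError(f"unrecognized mutation label: {mutation!r}")
    wild_type, position, replacement = match.group(1), int(match.group(2)), match.group(3)
    if not 1 <= position <= len(sequence):
        raise ValueError(f"position {position} lies outside the sequence")
    if sequence[position - 1] != wild_type:
        raise ValueError(
            f"expected {wild_type} at position {position}, found {sequence[position - 1]}"
        )
    return sequence[:position - 1] + replacement + sequence[position:]


CCR5_1_18 = "MDYQVSSPIYDINYYTSE"
print(apply_substitution(CCR5_1_18, "Y3A"))
```

Checking the wild-type letter against the sequence guards against off-by-one numbering errors when entries from different studies are compared.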

The second column provides the percentage of binding
exhibited by a mutant polypeptide compared to a wild-type
polypeptide, when the methods used to elucidate the
present invention are used in conjunction with
radiolabeled HIV-1^ gp120 envelope protein. The third
through seventh columns provide similar data that have
been extracted from the work of others in the field using
a strain of HIV-1 virus indicated at the top of each
column. For example, row 2 of the following table
indicates that when the mutation Y3A is effected in the
human CCR5 chemokine receptor, then the resulting CCR5
polypeptide has 51.4% of the ability to bind HIV-1^
gp120 envelope protein in comparison to an equivalent
wild-type peptide. Similarly, HIV-1 ADA binds to the
mutant polypeptide with 79% of the affinity of a
non-mutated CCR5 chemokine receptor.

          gp120    YU2     ADA     JR-FL    89.6    DH123

Y3A       51.4     n/a     79      82       n/a     42
Q4A       104      85      132     111      67      105
Y10A      19.2     2       50      26       10      3
D11A      62.8     2       27      22       6       3
Y14A      14.6     12      47      25       6       0
Y15A      21       30      3       3        1       0
E18A      4.1      45      1-2     12       3       10



Statistical analysis of these data indicates that
the similarity between the binding affinity of each
mutant peptide for gp120 elucidated in this study is not
more than about 25% likely to be causally unrelated to
the effects observed for YU2, and not more than about 4%
likely to be causally unrelated to the effects observed
for each of the other viruses listed in the table above.
Additionally, the affinity measurements generated by
the underlying technique have been demonstrated to be
accurate by (repetitively) showing that antibodies that
specifically bind to radiolabeled gp120 are capable of
preventing the binding of gp120 to polypeptides that have
shown high affinity for binding with gp120 in the
experiments upon which the present invention is
predicated. Thus, this example shows that the binding
of HIV-1 with chemokine receptors can be inhibited by the
present inventive polypeptides, irrespective of the
strain of HIV-1 from which the gp120 protein is obtained.

Example 8

This example provides a characterization of the
critical amino acids in the amino-terminal segments of
CCR5, CXCR4, and STRL33 that are essential for the
ability of these polypeptides to bind with gp120.

In this example, the effect on binding that occurs
due to successive replacement of each amino acid with
alanine is indicated, wherein a (+) signifies a decrease
in binding affinity and a (>) signifies an enhancement in
binding affinity. As is clear from inspection, the
sequences are shown with the amino-terminus at top and
the carboxyl-terminus at bottom.



CCR5 (1-18)     CXCR4 (1-18)     STRL33 (21-38)

M>              M                E
D               E+               E
Y++             G                H
Q               I+++++           Q
V               S>               A
S               I++++++          F+++
S+              Y+++++           L++
P>              T                Q+
I+++            [illegible]      F+++
Y+++            D+               S
D+              N++              K>
I++++           Y++              V+
N>              T                F+
Y++++           E                L
Y+++            E++              P
T               M                C+
S+              G                M
E+++++          S                Y

Example 9

This example employs the same technique as Example 4
and provides information similar to that available from
Example 4.

The data below compare the ability of synthetic
fragments of CD4 to bind to labeled gp120. 9-mers,
12-mers, 15-mers, 18-mers, and 21-mers were selected based
on the data from Example 4. The relative binding
affinities of each group of polypeptides can be
determined by inspection of the number of counts of
radiolabeled gp120 that were retained by each N-mer.
Data supporting these conclusions are provided by
Examples 10 and 11.



Peptide
starting                               gp120 bound
position #    Active Peptides          (counts)

ACTIVE 9-MERS
105           DTYICEVED                1043
115           KEEVQLLVF                1273
116           EEVQLLVFG                3170
117           EVQLLVFGL                2146
217           EQVEFSFPL                1032
218           QVEFSFPLA                1205
219           VEFSFPLAF                1064

ACTIVE 12-MERS
101           IEDSDTYICEVE             1107
112           EDQKEEVQLLVF             1379
113           DQKEEVQLLVFG             1624
114           QKEEVQLLVFGL             1785
115           KEEVQLLVFGLT             1774
116           EEVQLLVFGLTA             3261
117           EVQLLVFGLTAN             1838
133           LLQGQSLTLTLE             1320
215           EGEQVEFSFPLA             1456
216           GEQVEFSFPLAF             1729
217           EQVEFSFPLAFT             1556
218           QVEFSFPLAFTV             1636

ACTIVE 15-MERS
109           CEVEDQKEEVQLLVF          1729
110           EVEDQKEEVQLLVFG          2805
111           VEDQKEEVQLLVFGL          3816
112           EDQKEEVQLLVFGLT          3633
113           DQKEEVQLLVFGLTA          3905
114           QKEEVQLLVFGLTAN          3770
115           KEEVQLLVFGLTANS          3485
116           EEVQLLVFGLTANSD          6423
117           EVQLLVFGLTANSDT          2689
130           DTHLLQGQSLTLTLE          1622
131           THLLQGQSLTLTLES          1874
132           HLLQGQSLTLTLESP          1277
213           KKEGEQVEFSFPLAF          1921
214           KEGEQVEFSFPLAFT          3253
215           EGEQVEFSFPLAFTV          3270
216           GEQVEFSFPLAFTVE          4656
217           EQVEFSFPLAFTVEK          4135
218           QVEFSFPLAFTVEKL          2047

ACTIVE 18-MERS
105           DTYICEVEDQKEEVQLLV       1648
106           TYICEVEDQKEEVQLLVF       3794
107           YICEVEDQKEEVQLLVFG       4611
108           ICEVEDQKEEVQLLVFGL       3898
109           CEVEDQKEEVQLLVFGLT       3797
110           EVEDQKEEVQLLVFGLTA       3647
111           VEDQKEEVQLLVFGLTAN       3913
112           EDQKEEVQLLVFGLTANS       3416
113           DQKEEVQLLVFGLTANSD       3317
114           QKEEVQLLVFGLTANSDT       3671
127           ANSDTHLLQGQSLTLTLE       1540
128           NSDTHLLQGQSLTLTLES       1726
129           SDTHLLQGQSLTLTLESP       1260
210           IVYKKEGEQVEFSFPLAF       5382
211           VYKKEGEQVEFSFPLAFT       4307
212           YKKEGEQVEFSFPLAFTV       4839
213           KKEGEQVEFSFPLAFTVE       4683
214           KEGEQVEFSFPLAFTVEK       3117
215           EGEQVEFSFPLAFTVEKL       2164
216           GEQVEFSFPLAFTVEKLT       1643

ACTIVE 21-MERS
90            GNFPLIIKNLKIEDSDTYICE    5248
91            NFPLIIKNLKIEDSDTYICEV    7803
92            FPLIIKNLKIEDSDTYICEVE    13919
93            PLIIKNLKIEDSDTYICEVED    20145
94            LIIKNLKIEDSDTYICEVEDQ    17108
95            IIKNLKIEDSDTYICEVEDQK    11892
96            IKNLKIEDSDTYICEVEDQKE    15073
97            KNLKIEDSDTYICEVEDQKEE    8789
99            LKIEDSDTYICEVEDQKEEVQ    5519
100           KIEDSDTYICEVEDQKEEVQL    6325
101           IEDSDTYICEVEDQKEEVQLL    12064
102           EDSDTYICEVEDQKEEVQLLV    4933
103           DSDTYICEVEDQKEEVQLLVF    30277
104           SDTYICEVEDQKEEVQLLVFG    30319
105           DTYICEVEDQKEEVQLLVFGL    25424
106           TYICEVEDQKEEVQLLVFGLT    20191
107           YICEVEDQKEEVQLLVFGLTA    22884
108           ICEVEDQKEEVQLLVFGLTAN    7276
109           CEVEDQKEEVQLLVFGLTANS    3517
123           FGLTANSDTHLLQGQSLTLTL    11529
124           GLTANSDTHLLQGQSLTLTLE    14065
125           LTANSDTHLLQGQSLTLTLES    17112
126           TANSDTHLLQGQSLTLTLESP    23595
204           FQKASSIVYKKEGEQVEFSFP    9382
205           QKASSIVYKKEGEQVEFSFPL    24959
206           KASSIVYKKEGEQVEFSFPLA    30873
207           ASSIVYKKEGEQVEFSFPLAF    25146
208           SSIVYKKEGEQVEFSFPLAFT    28068
209           SIVYKKEGEQVEFSFPLAFTV    8165
210           IVYKKEGEQVEFSFPLAFTVE    15620
221           FSFPLAFTVEKLTGSGELWWQ    4163
222           SFPLAFTVEKLTGSGELWWQA    2284
223           FPLAFTVEKLTGSGELWWQAE    6276
224           PLAFTVEKLTGSGELWWQAER    2647
225           LAFTVEKLTGSGELWWQAERA    3577
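
The peptide scan summarized in this example can be illustrated with a
short sketch. It generates overlapping N-mers with their starting
positions and selects those whose retained counts exceed a cutoff. The
function names and the cutoff of 1000 counts are assumptions made for
illustration only; the specification does not state the criterion that
was used to designate a peptide as active.

```python
from typing import Dict, List, Tuple


def overlapping_peptides(sequence: str, length: int,
                         first_position: int = 1) -> List[Tuple[int, str]]:
    """Return (starting position, peptide) for every window of `length` residues."""
    return [(first_position + i, sequence[i:i + length])
            for i in range(len(sequence) - length + 1)]


def active_peptides(counts: Dict[str, int], min_counts: int) -> List[str]:
    """Return the peptides whose retained counts exceed `min_counts`."""
    return [peptide for peptide, bound in counts.items() if bound > min_counts]


# The active 18-mer listed in this example at starting position 105.
nine_mers = overlapping_peptides("DTYICEVEDQKEEVQLLV", 9, first_position=105)

# Three 9-mer counts reported in Panel D of Example 10.
panel_d = {"SDTYICEVE": 819, "DTYICEVED": 1043, "TYICEVEDQ": 805}
HYPOTHETICAL_CUTOFF = 1000  # assumed for illustration; not stated in the text
print(nine_mers[0], active_peptides(panel_d, HYPOTHETICAL_CUTOFF))
```

With the assumed cutoff, only DTYICEVED is selected from the three
9-mers, in agreement with the list of active 9-mers in this example.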





Example 10

This example provides data which enable those
skilled in the art to arrive at the conclusions indicated
in Examples 9 and 12. In this example, the counts of
radiolabeled gp120 retained by each peptide indicated in
the left-hand column are given in the right-hand column.
The first panel (Panel A) provides data for 21-mers of
CD4.



Panel A

PEPTIDE                  COUNTS

LWDQGNFPLIIKNLKIEDSDT    731
WDQGNFPLIIKNLKIEDSDTY    889
DQGNFPLIIKNLKIEDSDTYI    1138
QGNFPLIIKNLKIEDSDTYIC    2242
GNFPLIIKNLKIEDSDTYICE    5248
NFPLIIKNLKIEDSDTYICEV    7803
FPLIIKNLKIEDSDTYICEVE    13919
PLIIKNLKIEDSDTYICEVED    20145
LIIKNLKIEDSDTYICEVEDQ    17108
IIKNLKIEDSDTYICEVEDQK    11892
IKNLKIEDSDTYICEVEDQKE    15073
KNLKIEDSDTYICEVEDQKEE    8789
NLKIEDSDTYICEVEDQKEEV    2016
LKIEDSDTYICEVEDQKEEVQ    5519
KIEDSDTYICEVEDQKEEVQL    6325
IEDSDTYICEVEDQKEEVQLL    12064
EDSDTYICEVEDQKEEVQLLV    4933
DSDTYICEVEDQKEEVQLLVF    30277
SDTYICEVEDQKEEVQLLVFG    30319
DTYICEVEDQKEEVQLLVFGL    25424
TYICEVEDQKEEVQLLVFGLT    20191
YICEVEDQKEEVQLLVFGLTA    22884
ICEVEDQKEEVQLLVFGLTAN    7276
CEVEDQKEEVQLLVFGLTANS    3517
EVEDQKEEVQLLVFGLTANSD    1687
VEDQKEEVQLLVFGLTANSDT    646
EDQKEEVQLLVFGLTANSDTH    562
DQKEEVQLLVFGLTANSDTHL    599
QKEEVQLLVFGLTANSDTHLL    573
KEEVQLLVFGLTANSDTHLLQ    682
EEVQLLVFGLTANSDTHLLQG    690
EVQLLVFGLTANSDTHLLQGQ    589
VQLLVFGLTANSDTHLLQGQS    1099
QLLVFGLTANSDTHLLQGQSL    2057
LLVFGLTANSDTHLLQGQSLT    860
LVFGLTANSDTHLLQGQSLTL    4677
VFGLTANSDTHLLQGQSLTLT    2762
FGLTANSDTHLLQGQSLTLTL    11529
GLTANSDTHLLQGQSLTLTLE    14065
LTANSDTHLLQGQSLTLTLES    17113
TANSDTHLLQGQSLTLTLESP    23595
Empty (control)          515

TWTCTVLQNQKKVEFKIDIVV    1430
WTCTVLQNQKKVEFKIDIVVL    1616
TCTVLQNQKKVEFKIDIVVLA    1092
CTVLQNQKKVEFKIDIVVLAF    2909
TVLQNQKKVEFKIDIVVLAFQ    3273
VLQNQKKVEFKIDIVVLAFQK    1323
LQNQKKVEFKIDIVVLAFQKA    1256
QNQKKVEFKIDIVVLAFQKAS    1808
NQKKVEFKIDIVVLAFQKASS    1507
QKKVEFKIDIVVLAFQKASSI    759
KKVEFKIDIVVLAFQKASSIV    782
KVEFKIDIVVLAFQKASSIVY    635
VEFKIDIVVLAFQKASSIVYK    725
EFKIDIVVLAFQKASSIVYKK    649
FKIDIVVLAFQKASSIVYKKE    593
KIDIVVLAFQKASSIVYKKEG    1394
IDIVVLAFQKASSIVYKKEGE    962
DIVVLAFQKASSIVYKKEGEQ    788
IVVLAFQKASSIVYKKEGEQV    646
VVLAFQKASSIVYKKEGEQVE    772
VLAFQKASSIVYKKEGEQVEF    1793
LAFQKASSIVYKKEGEQVEFS    1410
AFQKASSIVYKKEGEQVEFSF    3775
FQKASSIVYKKEGEQVEFSFP    9382
QKASSIVYKKEGEQVEFSFPL    24959
KASSIVYKKEGEQVEFSFPLA    30873
ASSIVYKKEGEQVEFSFPLAF    25146
SSIVYKKEGEQVEFSFPLAFT    28068
SIVYKKEGEQVEFSFPLAFTV    8165
IVYKKEGEQVEFSFPLAFTVE    15620
VYKKEGEQVEFSFPLAFTVEK    2429
YKKEGEQVEFSFPLAFTVEKL    735
KKEGEQVEFSFPLAFTVEKLT    1847
KEGEQVEFSFPLAFTVEKLTG    972
EGEQVEFSFPLAFTVEKLTGS    739
GEQVEFSFPLAFTVEKLTGSG    652
EQVEFSFPLAFTVEKLTGSGE    765
QVEFSFPLAFTVEKLTGSGEL    741
VEFSFPLAFTVEKLTGSGELW    633
EFSFPLAFTVEKLTGSGELWW    681
FSFPLAFTVEKLTGSGELWWQ    4163
SFPLAFTVEKLTGSGELWWQA    2284
FPLAFTVEKLTGSGELWWQAE    6276
PLAFTVEKLTGSGELWWQAER    2647
LAFTVEKLTGSGELWWQAERA    3577
AFTVEKLTGSGELWWQAERAS    1739
Empty (control)          617

The second and third panels (Panels B and C) provide data for 18-mers of a
small region of CD4.



Panel B

PEPTIDE               COUNTS

LWDQGNFPLIIKNLK       502
WDQGNFPLIIKNLKI       534
DQGNFPLIIKNLKIE       635
QGNFPLIIKNLKIED       509
GNFPLIIKNLKIEDS       624
NFPLIIKNLKIEDSD       654
FPLIIKNLKIEDSDT       539
PLIIKNLKIEDSDTY       661
LIIKNLKIEDSDTYI       542
IIKNLKIEDSDTYIC       664
IKNLKIEDSDTYICE       568
KNLKIEDSDTYICEV       562
NLKIEDSDTYICEVE       1160
LKIEDSDTYICEVED       846
KIEDSDTYICEVEDQ       1088
IEDSDTYICEVEDQK       1143
EDSDTYICEVEDQKE       815
DSDTYICEVEDQKEE       973
SDTYICEVEDQKEEV       993
DTYICEVEDQKEEVQ       1071
TYICEVEDQKEEVQL       956
YICEVEDQKEEVQLL       1064
ICEVEDQKEEVQLLV       1084
CEVEDQKEEVQLLVF       1729
EVEDQKEEVQLLVFG       2805
VEDQKEEVQLLVFGL       3816
EDQKEEVQLLVFGLT       3633
DQKEEVQLLVFGLTA       3905
QKEEVQLLVFGLTAN       3770
KEEVQLLVFGLTANS       3485
EEVQLLVFGLTANSD       6423
EVQLLVFGLTANSDT       2689
VQLLVFGLTANSDTH       1006
QLLVFGLTANSDTHL       865
LLVFGLTANSDTHLL       599
LVFGLTANSDTHLLQ       609
VFGLTANSDTHLLQG       532
FGLTANSDTHLLQGQ       625
GLTANSDTHLLQGQS       532
LTANSDTHLLQGQSL       634
TANSDTHLLQGQSLT       513
ANSDTHLLQGQSLTL       542
NSDTHLLQGQSLTLT       631
SDTHLLQGQSLTLTL       747
DTHLLQGQSLTLTLE       1622
THLLQGQSLTLTLES       1874
HLLQGQSLTLTLESP       1277

LWDQGNFPLIIKNLKIED    582
WDQGNFPLIIKNLKIEDS    626
DQGNFPLIIKNLKIEDSD    598
QGNFPLIIKNLKIEDSDT    564
GNFPLIIKNLKIEDSDTY    557
NFPLIIKNLKIEDSDTYI    627
FPLIIKNLKIEDSDTYIC    509
PLIIKNLKIEDSDTYICE    624
LIIKNLKIEDSDTYICEV    634
IIKNLKIEDSDTYICEVE    751
IKNLKIEDSDTYICEVED    699
KNLKIEDSDTYICEVEDQ    708
NLKIEDSDTYICEVEDQK    863
LKIEDSDTYICEVEDQKE    872
KIEDSDTYICEVEDQKEE    858
IEDSDTYICEVEDQKEEV    1230
EDSDTYICEVEDQKEEVQ    788
DSDTYICEVEDQKEEVQL    961
SDTYICEVEDQKEEVQLL    870
DTYICEVEDQKEEVQLLV    1648
TYICEVEDQKEEVQLLVF    3794
YICEVEDQKEEVQLLVFG    4611
ICEVEDQKEEVQLLVFGL    3898
CEVEDQKEEVQLLVFGLT    3797
EVEDQKEEVQLLVFGLTA    3647
VEDQKEEVQLLVFGLTAN    3913
EDQKEEVQLLVFGLTANS    3416
DQKEEVQLLVFGLTANSD    3317
QKEEVQLLVFGLTANSDT    3671
KEEVQLLVFGLTANSDTH    1271
EEVQLLVFGLTANSDTHL    783
EVQLLVFGLTANSDTHLL    667
VQLLVFGLTANSDTHLLQ    673
QLLVFGLTANSDTHLLQG    574
LLVFGLTANSDTHLLQGQ    568
LVFGLTANSDTHLLQGQS    564
VFGLTANSDTHLLQGQSL    531
FGLTANSDTHLLQGQSLT    591
GLTANSDTHLLQGQSLTL    572
LTANSDTHLLQGQSLTLT    528
TANSDTHLLQGQSLTLTL    891
ANSDTHLLQGQSLTLTLE    1540
NSDTHLLQGQSLTLTLES    1726
SDTHLLQGQSLTLTLESP    1260
Empty (control)       575


Panel C

PEPTIDE               COUNTS

WTCTVLQNQKKVEFK       566
TCTVLQNQKKVEFKI       510
CTVLQNQKKVEFKID       608
TVLQNQKKVEFKIDI       587
VLQNQKKVEFKIDIV       605
LQNQKKVEFKIDIVV       644
QNQKKVEFKIDIVVL       636
NQKKVEFKIDIVVLA       860
QKKVEFKIDIVVLAF       1333
KKVEFKIDIVVLAFQ       951
KVEFKIDIVVLAFQK       1051
VEFKIDIVVLAFQKA       1005
EFKIDIVVLAFQKAS       1188
FKIDIVVLAFQKASS       1001
KIDIVVLAFQKASSI       956
IDIVVLAFQKASSIV       865
DIVVLAFQKASSIVY       776
IVVLAFQKASSIVYK       783
VVLAFQKASSIVYKK       577
VLAFQKASSIVYKKE       634
LAFQKASSIVYKKEG       593
AFQKASSIVYKKEGE       544
FQKASSIVYKKEGEQ       637
QKASSIVYKKEGEQV       519
KASSIVYKKEGEQVE       563
ASSIVYKKEGEQVEF       589
SSIVYKKEGEQVEFS       558
SIVYKKEGEQVEFSF       651
IVYKKEGEQVEFSFP       615
VYKKEGEQVEFSFPL       714
YKKEGEQVEFSFPLA       687
KKEGEQVEFSFPLAF       1921
KEGEQVEFSFPLAFT       3253
EGEQVEFSFPLAFTV       3270
GEQVEFSFPLAFTVE       4656
EQVEFSFPLAFTVEK       4135
QVEFSFPLAFTVEKL       2047
VEFSFPLAFTVEKLT       899
EFSFPLAFTVEKLTG       920
FSFPLAFTVEKLTGS       672
SFPLAFTVEKLTGSG       565
FPLAFTVEKLTGSGE       556
PLAFTVEKLTGSGEL       612
LAFTVEKLTGSGELW       579
AFTVEKLTGSGELWW       586
FTVEKLTGSGELWWQ       625
TVEKLTGSGELWWQA       550
VEKLTGSGELWWQAE       735
EKLTGSGELWWQAER       683

WTCTVLQNQKKVEFKIDI    588
TCTVLQNQKKVEFKIDIV    571
CTVLQNQKKVEFKIDIVV    553
TVLQNQKKVEFKIDIVVL    655
VLQNQKKVEFKIDIVVLA    724
LQNQKKVEFKIDIVVLAF    938
QNQKKVEFKIDIVVLAFQ    917
NQKKVEFKIDIVVLAFQK    889
QKKVEFKIDIVVLAFQKA    1013
KKVEFKIDIVVLAFQKAS    912
KVEFKIDIVVLAFQKASS    1011
VEFKIDIVVLAFQKASSI    819
EFKIDIVVLAFQKASSIV    799
FKIDIVVLAFQKASSIVY    843
KIDIVVLAFQKASSIVYK    779
IDIVVLAFQKASSIVYKK    711
DIVVLAFQKASSIVYKKE    660
IVVLAFQKASSIVYKKEG    531
VVLAFQKASSIVYKKEGE    560
VLAFQKASSIVYKKEGEQ    549
LAFQKASSIVYKKEGEQV    665
AFQKASSIVYKKEGEQVE    514
FQKASSIVYKKEGEQVEF    528
QKASSIVYKKEGEQVEFS    602
KASSIVYKKEGEQVEFSF    536
ASSIVYKKEGEQVEFSFP    701
SSIVYKKEGEQVEFSFPL    756
SIVYKKEGEQVEFSFPLA    771
IVYKKEGEQVEFSFPLAF    5382
VYKKEGEQVEFSFPLAFT    4307
YKKEGEQVEFSFPLAFTV    4839
KKEGEQVEFSFPLAFTVE    4683
KEGEQVEFSFPLAFTVEK    3117
EGEQVEFSFPLAFTVEKL    2164
GEQVEFSFPLAFTVEKLT    1643
EQVEFSFPLAFTVEKLTG    798
QVEFSFPLAFTVEKLTGS    736
VEFSFPLAFTVEKLTGSG    533
EFSFPLAFTVEKLTGSGE    668
FSFPLAFTVEKLTGSGEL    613
SFPLAFTVEKLTGSGELW    656
FPLAFTVEKLTGSGELWW    586
PLAFTVEKLTGSGELWWQ    650
LAFTVEKLTGSGELWWQA    866
AFTVEKLTGSGELWWQAE    788
FTVEKLTGSGELWWQAER    1143
Empty (control)       556

The fourth and fifth panels (Panels D and E) provide data for select 9-mers and
12-mers of CD4.



Panel D

PEPTIDE            COUNTS

DQGNFPLII          662
QGNFPLIIK          508
GNFPLIIKN          600
NFPLIIKNL          561
FPLIIKNLK          601
PLIIKNLKI          697
LIIKNLKIE          515
IIKNLKIED          658
IKNLKIEDS          557
KNLKIEDSD          612
NLKIEDSDT          512
LKIEDSDTY          492
KIEDSDTYI          603
IEDSDTYIC          567
EDSDTYICE          650
DSDTYICEV          712
SDTYICEVE          819
DTYICEVED          1043
TYICEVEDQ          805
YICEVEDQK          728
ICEVEDQKE          596
CEVEDQKEE          555
EVEDQKEEV          587
VEDQKEEVQ          521
EDQKEEVQL          564
DQKEEVQLL          589
QKEEVQLLV          636
KEEVQLLVF          1273
EEVQLLVFG          3170
EVQLLVFGL          2146
VQLLVFGLT          815
QLLVFGLTA          822
LLVFGLTAN          576
LVFGLTANS          522
VFGLTANSD          549
FGLTANSDT          563
GLTANSDTH          481
LTANSDTHL          596
TANSDTHLL          554
ANSDTHLLQ          642
NSDTHLLQG          561
SDTHLLQGQ          526
DTHLLQGQS          578
THLLQGQSL          512
HLLQGQSLT          564
LLQGQSLTL          568
LQGQSLTLT          501
QGQSLTLTL          594
GQSLTLTLE          777

DQGNFPLIIKNL       604
QGNFPLIIKNLK       533
GNFPLIIKNLKI       547
NFPLIIKNLKIE       647
FPLIIKNLKIED       511
PLIIKNLKIEDS       565
LIIKNLKIEDSD       619
IIKNLKIEDSDT       511
IKNLKIEDSDTY       574
KNLKIEDSDTYI       523
NLKIEDSDTYIC       639
LKIEDSDTYICE       635
KIEDSDTYICEV       601
IEDSDTYICEVE       1107
EDSDTYICEVED       956
DSDTYICEVEDQ       937
SDTYICEVEDQK       846
DTYICEVEDQKE       720
TYICEVEDQKEE       818
YICEVEDQKEEV       734
ICEVEDQKEEVQ       585
CEVEDQKEEVQL       561
EVEDQKEEVQLL       508
VEDQKEEVQLLV       657
EDQKEEVQLLVF       1379
DQKEEVQLLVFG       1624
QKEEVQLLVFGL       1785
KEEVQLLVFGLT       1774
EEVQLLVFGLTA       3261
EVQLLVFGLTAN       1838
VQLLVFGLTANS       747
QLLVFGLTANSD       721
LLVFGLTANSDT       533
LVFGLTANSDTH       586
VFGLTANSDTHL       548
FGLTANSDTHLL       571
GLTANSDTHLLQ       574
LTANSDTHLLQG       534
TANSDTHLLQGQ       549
ANSDTHLLQGQS       559
NSDTHLLQGQSL       585
SDTHLLQGQSLT       540
DTHLLQGQSLTL       527
THLLQGQSLTLT       646
HLLQGQSLTLTL       701
LLQGQSLTLTLE       1320
Empty (control)    581

Panel E

PEPTIDE            COUNTS

TVLQNQKKV          534
VLQNQKKVE          556
LQNQKKVEF          565
QNQKKVEFK          537
NQKKVEFKI          597
QKKVEFKID          575
KKVEFKIDI          501
KVEFKIDIV          555
VEFKIDIVV          548
EFKIDIVVL          665
FKIDIVVLA          568
KIDIVVLAF          665
IDIVVLAFQ          691
DIVVLAFQK          686
IVVLAFQKA          602
VVLAFQKAS          600
VLAFQKASS          466
LAFQKASSI          592
AFQKASSIV          595
FQKASSIVY          568
QKASSIVYK          494
KASSIVYKK          498
ASSIVYKKE          600
SSIVYKKEG          515
SIVYKKEGE          566
IVYKKEGEQ          534
VYKKEGEQV          490
YKKEGEQVE          518
KKEGEQVEF          546
KEGEQVEFS          595
EGEQVEFSF          735
GEQVEFSFP          697
EQVEFSFPL          1032
QVEFSFPLA          1205
VEFSFPLAF          1064
EFSFPLAFT          658
FSFPLAFTV          472
SFPLAFTVE          619
FPLAFTVEK          569
PLAFTVEKL          597
LAFTVEKLT          501
AFTVEKLTG          517
FTVEKLTGS          574
TVEKLTGSG          487
VEKLTGSGE          585
EKLTGSGEL          541
KLTGSGELW          491
LTGSGELWW          550
TGSGELWWQ          507

TVLQNQKKVEFK       563
VLQNQKKVEFKI       503
LQNQKKVEFKID       508
QNQKKVEFKIDI       559
NQKKVEFKIDIV       532
QKKVEFKIDIVV       595
KKVEFKIDIVVL       597
KVEFKIDIVVLA       560
VEFKIDIVVLAF       681
EFKIDIVVLAFQ       659
FKIDIVVLAFQK       736
KIDIVVLAFQKA       689
IDIVVLAFQKAS       630
DIVVLAFQKASS       746
IVVLAFQKASSI       548
VVLAFQKASSIV       567
VLAFQKASSIVY       548
LAFQKASSIVYK       465
AFQKASSIVYKK       597
FQKASSIVYKKE       577
QKASSIVYKKEG       596
KASSIVYKKEGE       559
ASSIVYKKEGEQ       523
SSIVYKKEGEQV       615
SIVYKKEGEQVE       543
IVYKKEGEQVEF       533
VYKKEGEQVEFS       584
YKKEGEQVEFSF       548
KKEGEQVEFSFP       598
KEGEQVEFSFPL       710
EGEQVEFSFPLA       1456
GEQVEFSFPLAF       1729
EQVEFSFPLAFT       1556
QVEFSFPLAFTV       1636
VEFSFPLAFTVE       518
EFSFPLAFTVEK       585
FSFPLAFTVEKL       573
SFPLAFTVEKLT       528
FPLAFTVEKLTG       622
PLAFTVEKLTGS       528
LAFTVEKLTGSG       608
AFTVEKLTGSGE       511
FTVEKLTGSGEL       530
TVEKLTGSGELW       573
VEKLTGSGELWW       477
EKLTGSGELWWQ       543
Empty (control)    571

Panels F and G provide data on sequential alanine 
replacements for selected CD4 polypeptides. 



Panel F

PEPTIDE            COUNTS

ZZZZZZDTYICEVED    5844
ZZZZZZATYICEVED    5921
ZZZZZZDAYICEVED    6362
ZZZZZZDTAICEVED    1301
ZZZZZZDTYACEVED    2583
ZZZZZZDTYIAEVED    4483
ZZZZZZDTYICAVED    3154
ZZZZZZDTYICEAED    3432
ZZZZZZDTYICEVAD    3595
ZZZZZZDTYICEVEA    5942
ZZZZZZDTYICEVED    4973

ZZZZZZDTYICEVED    4775
ZZZZZZATYICEVED    4962
ZZZZZZDAYICEVED    4163
ZZZZZZDTAICEVED    1384
ZZZZZZDTYACEVED    3085
ZZZZZZDTYIAEVED    5128
ZZZZZZDTYICAVED    2587
ZZZZZZDTYICEAED    2499
ZZZZZZDTYICEVAD    2706
ZZZZZZDTYICEVEA    6345
ZZZZZZDTYICEVED    5564

EEVQLLVFGLTANSD    18582
AEVQLLVFGLTANSD    16220
EAVQLLVFGLTANSD    14220
EEAQLLVFGLTANSD    18124
EEVALLVFGLTANSD    10890
EEVQALVFGLTANSD    11258
EEVQLAVFGLTANSD    11954
EEVQLLAFGLTANSD    13317
EEVQLLVAGLTANSD    9573
EEVQLLVFALTANSD    19348
EEVQLLVFGATANSD    10408
EEVQLLVFGLAANSD    19973
EEVQLLVFGLTTNSD    20100
EEVQLLVFGLTAASD    19390
EEVQLLVFGLTANAD    17684
EEVQLLVFGLTANSA    18227
EEVQLLVFGLTANSD    19738

EEVQLLVFGLTANSD    21338
AEVQLLVFGLTANSD    14590
EAVQLLVFGLTANSD    13213
EEAQLLVFGLTANSD    16296
EEVALLVFGLTANSD    13415
EEVQALVFGLTANSD    12603
EEVQLAVFGLTANSD    13690
EEVQLLAFGLTANSD    16286
EEVQLLVAGLTANSD    11480
EEVQLLVFALTANSD    18254
EEVQLLVFGATANSD    19978
EEVQLLVFGLAANSD    18863
EEVQLLVFGLTTNSD    20021
EEVQLLVFGLTAASD    19200
EEVQLLVFGLTANAD    17928
EEVQLLVFGLTANSA    22206
EEVQLLVFGLTANSD    18721

THLLQGQSLTLTLES    7756
AHLLQGQSLTLTLES    8602
TALLQGQSLTLTLES    6931
THALQGQSLTLTLES    7683
THLAQGQSLTLTLES    7701
THLLAGQSLTLTLES    4578
THLLQAQSLTLTLES    8471
THLLQGASLTLTLES    4238
THLLQGQALTLTLES    8659
THLLQGQSATLTLES    4430
THLLQGQSLALTLES    8158
THLLQGQSLTATLES    4380
THLLQGQSLTLALES    11699
THLLQGQSLTLTAES    862
THLLQGQSLTLTLAS    2596
THLLQGQSLTLTLEA    5849
THLLQGQSLTLTLES    6545

THLLQGQSLTLTLES    4787
AHLLQGQSLTLTLES    5826
TALLQGQSLTLTLES    5012
THALQGQSLTLTLES    5059
THLAQGQSLTLTLES    5120
THLLAGQSLTLTLES    2956
THLLQAQSLTLTLES    6393
THLLQGASLTLTLES    1933
THLLQGQALTLTLES    5151
THLLQGQSATLTLES    1391
THLLQGQSLALTLES    4749
THLLQGQSLTATLES    813
THLLQGQSLTLALES    8147
THLLQGQSLTLTAES    797
THLLQGQSLTLTLAS    2193
THLLQGQSLTLTLEA    7984
THLLQGQSLTLTLES    5947
Empty (control)    569


Panel G

PEPTIDE            COUNTS

GEQVEFSFPLAFTVE    20691
AEQVEFSFPLAFTVE    18546
GAQVEFSFPLAFTVE    17733
GEAVEFSFPLAFTVE    17500
GEQAEFSFPLAFTVE    14764
GEQVAFSFPLAFTVE    16668
GEQVEASFPLAFTVE    6793
GEQVEFAFPLAFTVE    21681
GEQVEFSAPLAFTVE    7767
GEQVEFSFALAFTVE    20480
GEQVEFSFPAAFTVE    10024
GEQVEFSFPLTFTVE    17397
GEQVEFSFPLAATVE    10130
GEQVEFSFPLAFAVE    20627
GEQVEFSFPLAFTAE    18797
GEQVEFSFPLAFTVA    18371
GEQVEFSFPLAFTVE    17662

GEQVEFSFPLAFTVE    19190
AEQVEFSFPLAFTVE    18042
GAQVEFSFPLAFTVE    18079
GEAVEFSFPLAFTVE    19756
GEQAEFSFPLAFTVE    13000
GEQVAFSFPLAFTVE    13930
GEQVEASFPLAFTVE    6533
GEQVEFAFPLAFTVE    20072
GEQVEFSAPLAFTVE    7378
GEQVEFSFALAFTVE    19480
GEQVEFSFPAAFTVE    10589
GEQVEFSFPLTFTVE    18318
GEQVEFSFPLAATVE    9572
GEQVEFSFPLAFAVE    19516
GEQVEFSFPLAFTAE    16765
GEQVEFSFPLAFTVA    18187
GEQVEFSFPLAFTVE    18219

ZZZZZZDTYICEVED    5017
ZZZZZZDTYICEVEZ    5421
ZZZZZZDTYICEVZZ    2166
ZZZZZZDTYICEZZZ    922
ZZZZZZDTYIZZZZZ    564
ZZZZZZZTYICEVED    3031

EEVQLLVFGLTANSD    23357
EEVQLLVFGLTANSZ    15808
EEVQLLVFGLTANZZ    16496
EEVQLLVFGLTAZZZ    14097
EEVQLLVFGLTZZZZ    16473
EEVQLLVFGLZZZZZ    10516
EEVQLLVFGZZZZZZ    10372
EEVQLLVFZZZZZZZ    7333
EEVQLLVZZZZZZZZ    1098
ZEVQLLVFGLTANSD    16716
ZZVQLLVFGLTANSD    5281
ZZZQLLVFGLTANSD    4310
ZZZZLLVFGLTANSD    1026
ZZZZZLVFGLTANSD    664
ZZZZZZVFGLTANSD    779
ZZZZZZZFGLTANSD    760
ZZZZZZZZGLTANSD    657
EEVQLLVFGLTANSD    18040

THLLQGQSLTLTLES    10850
THLLQGQSLTLTLEZ    10269
THLLQGQSLTLTLZZ    4668
THLLQGQSLTLTZZZ    908
THLLQGQSLTLZZZZ    844
THLLQGQSLTZZZZZ    475
THLLQGQSLZZZZZZ    548
THLLQGQSZZZZZZZ    570
THLLQGQZZZZZZZZ    442
ZHLLQGQSLTLTLES    11445
ZZLLQGQSLTLTLES    11631
ZZZLQGQSLTLTLES    7993
ZZZZQGQSLTLTLES    6887
ZZZZZGQSLTLTLES    3305
ZZZZZZQSLTLTLES    4453
ZZZZZZZSLTLTLES    1086
ZZZZZZZZLTLTLES    1201
THLLQGQSLTLTLES    9756

GEQVEFSFPLAFTVE    18856
GEQVEFSFPLAFTVZ    16222
GEQVEFSFPLAFTZZ    12535
GEQVEFSFPLAFZZZ    11384
GEQVEFSFPLAZZZZ    5846
GEQVEFSFPLZZZZZ    4749
GEQVEFSFPZZZZZZ    2208
GEQVEFSFZZZZZZZ    3277
GEQVEFSZZZZZZZZ    742
ZEQVEFSFPLAFTVE    19736
ZZQVEFSFPLAFTVE    18684
ZZZVEFSFPLAFTVE    12892
ZZZZEFSFPLAFTVE    12166
ZZZZZFSFPLAFTVE    2134
ZZZZZZSFPLAFTVE    1454
ZZZZZZZFPLAFTVE    1391
ZZZZZZZZPLAFTVE    1489
GEQVEFSFPLAFTVE    18867
Empty (control)    580
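
The alanine replacement counts of Panels F and G can be reduced to
per-residue marks by comparing each substituted peptide with the
unsubstituted peptide from the same series. The following is a minimal
sketch of such a comparison. The function name and the ratio cutoffs
(0.75 and 1.25) are assumptions made for illustration; the
specification does not state how a decrease (+) or an enhancement (>)
was scored.

```python
from typing import List, Tuple


def alanine_scan_marks(wild_type: str, wild_type_counts: float,
                       mutant_counts: List[float],
                       lower: float = 0.75,
                       upper: float = 1.25) -> List[Tuple[str, float, str]]:
    """Mark each residue '+' (binding decreased), '>' (increased) or '' (unchanged)."""
    marks = []
    for residue, counts in zip(wild_type, mutant_counts):
        ratio = counts / wild_type_counts  # substituted / unsubstituted counts
        if ratio < lower:
            mark = "+"
        elif ratio > upper:
            mark = ">"
        else:
            mark = ""
        marks.append((residue, round(ratio, 2), mark))
    return marks


# First DTYICEVED series of Panel F: unsubstituted peptide (5844 counts),
# then one count per substituted position, amino-terminus first.
scan = alanine_scan_marks(
    "DTYICEVED", 5844,
    [5921, 6362, 1301, 2583, 4483, 3154, 3432, 3595, 5942])
for residue, ratio, mark in scan:
    print(residue, ratio, mark)
```

Under these assumed cutoffs the residues Y, I, E, V, and E of
DTYICEVED are marked as decreasing binding when replaced, while D, T,
C, and the terminal D are unmarked.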



Example 11 

This example characterizes CD4 receptor sequences found
to have HIV gp120 binding activity in screening tests.
Panel A displays information obtained from sequential
replacement of amino acid residues by alaninyl residues.
In Panel A, a (+) signifies a decrease in binding
affinity, whereas a (>) indicates that replacement of the
residue by an alaninyl residue yields an increase in
binding affinity. Sequences are shown with the amino-
terminus at the top and the carboxyl-terminus at the
bottom. Right and left sides are from independent
assays.



Panel A

105-113    116-130    131-145        216-229

D          E          T              G
T          E          H              E
++Y++      V          L              Q
+I+        +Q+        L              +V+
C          +L+        +Q+            +E+
+E+        +L+        G              ++F++
+V+        +V+        +Q+            S
+E+        +F+        [illegible]    ++F++
D          G          +L+            P
           +L         T              ++L++
           T          +L++           A
           A          >T>            ++F++
           N          +++L+++        T
           S          ++E++          V
           D          S              E


Panel B indicates the effect on binding affinity when
successive amino acid residues are deleted, either from
the amino-terminus (right-side symbols) or from the
carboxyl-terminus, i.e., from the bottom (left-side
symbols). A (+) signifies a decrease in binding affinity,
and the underlined residues indicate which residue was the
last residue to be serially deleted.



Panel B

105-113    116-130      131-145        216-229

D+         E            T              G
T          E+           H              E
Y          V+           [illegible]    Q+
I          Q++          L+             V+
C          L+++         Q++            E+++
+++E       L+++         G++            F+++
++V        V+++         Q+++           S++++
+E         ++++F++++    +++S+++        ++++F++++
D          ++G          +++L           +++P
           +L           +++T           +++L
           T            +++L           ++A
           A            ++T            ++F
           N            ++L            +T
           S            +E             +V
           D            S              E
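
The serial deletion data underlying Panel B can be examined in a
similar way. The sketch below expresses each truncated peptide as a
fraction of the counts retained by the full-length peptide and reports
the shortest truncation that still retains a chosen fraction. It
assumes that Z in the peptide listings of Example 10 denotes a deleted
position, and the 25% cutoff and function names are hypothetical
choices made for illustration.

```python
from typing import List, Optional, Tuple


def retained_fractions(series: List[Tuple[str, int]],
                       full_length_counts: int) -> List[Tuple[str, float]]:
    """Express each peptide's counts as a fraction of the full-length counts."""
    return [(peptide, counts / full_length_counts) for peptide, counts in series]


def shortest_binder(series: List[Tuple[str, int]], full_length_counts: int,
                    min_fraction: float = 0.25,
                    filler: str = "Z") -> Optional[str]:
    """Return the shortest peptide (filler removed) retaining `min_fraction` of binding."""
    binders = [peptide.strip(filler)
               for peptide, fraction in retained_fractions(series, full_length_counts)
               if fraction >= min_fraction]
    return min(binders, key=len) if binders else None


# Carboxyl-terminal deletion series of EEVQLLVFGLTANSD (Example 10, Panel G).
c_terminal_series = [
    ("EEVQLLVFGLTANSD", 23357),
    ("EEVQLLVFGLTANSZ", 15808),
    ("EEVQLLVFGLTANZZ", 16496),
    ("EEVQLLVFGLTAZZZ", 14097),
    ("EEVQLLVFGLTZZZZ", 16473),
    ("EEVQLLVFGLZZZZZ", 10516),
    ("EEVQLLVFGZZZZZZ", 10372),
    ("EEVQLLVFZZZZZZZ", 7333),
    ("EEVQLLVZZZZZZZZ", 1098),
]
print(shortest_binder(c_terminal_series, full_length_counts=23357))
```

With the assumed cutoff, the series points to EEVQLLVF as the shortest
carboxyl-terminal truncation that still binds, since removing the
phenylalanine reduces the retained counts to about 5% of the
full-length value.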
All publications cited herein are hereby
incorporated by reference to the same extent as if each 
publication were individually and specifically indicated 
to be incorporated by reference and were set forth in its 
entirety herein. 

While this invention has been described with an 
emphasis upon preferred embodiments, it will be obvious 
to those of ordinary skill in the art that variations of 
the preferred embodiments can be used and that it is 
intended that the invention can be practiced otherwise 
than as specifically described herein. Accordingly, this 
invention includes all modifications encompassed within
the spirit and scope of the invention as defined by the
following claims.



